Is a repository layer over sqlc over-engineering or necessary for scale? I’m building a notification engine in Go using sqlc for the DB layer. Do you just inject *db.Queries into your services, or do you find the abstraction of a repository layer worth the extra code?
I attempted to answer it there, and the gist was correct. But I wrote it in a hurry, so the example and the explanation could be better. I’m capturing it properly here.
Call it repository or whatever you want, the name doesn’t matter. The point is that your
business logic should be oblivious to the persistence layer. Doesn’t matter if it’s sqlc,
raw database/sql, or gorm. If your service functions call sqlc queries directly, your
core logic is coupled to your database code. That makes it harder to test in isolation and
harder to swap out later.
Put a small interface between your business code and your storage code. The business side defines what it needs, the storage side satisfies it, and they live in separate packages.
Say you’re building a service that manages books. Start with the domain type and the storage interface:
// bookstore/bookstore.go
type Book struct {
ID int64
Title string
}
type BookStore interface {
Get(ctx context.Context, id int64) (Book, error)
Create(ctx context.Context, b Book) (int64, error)
}
The service depends only on that interface:
// bookstore/service.go
type Service struct {
store BookStore
}
func NewService(s BookStore) *Service {
return &Service{store: s}
}
func (s *Service) RegisterBook(
ctx context.Context, title string) (Book, error) {
b := Book{Title: title}
id, err := s.store.Create(ctx, b)
if err != nil {
return Book{}, err
}
b.ID = id
return b, nil
}
func (s *Service) GetBook(ctx context.Context, id int64) (Book, error) {
return s.store.Get(ctx, id)
}
RegisterBook doesn’t know about SQL, sqlc, or Postgres. It builds a Book, asks the store
to persist it, and gets an ID back.
The concrete implementation goes in a separate package. This is where sqlc-generated code would live:
// postgres/store.go
type Store struct{ db *sql.DB }
func NewStore(db *sql.DB) *Store { return &Store{db: db} }
func (s *Store) Get(ctx context.Context, id int64) (bookstore.Book, error) {
// sqlc query or raw sql, doesn't matter
// ...
}
func (s *Store) Create(
ctx context.Context, b bookstore.Book) (int64, error) {
// INSERT INTO books (title) VALUES ($1) RETURNING id
// ...
}
Wire it up at startup:
// cmd/main.go
store := postgres.NewStore(db)
svc := bookstore.NewService(store)
In tests, swap in a fake that satisfies the same interface:
// bookstore/service_test.go
var _ BookStore = (*memStore)(nil)
type memStore struct {
mu sync.Mutex
data map[int64]Book
next int64
}
func (m *memStore) Get(
ctx context.Context, id int64) (Book, error) {
m.mu.Lock()
defer m.mu.Unlock()
b, ok := m.data[id]
if !ok {
return Book{}, fmt.Errorf("book %d not found", id)
}
return b, nil
}
func (m *memStore) Create(
ctx context.Context, b Book) (int64, error) {
m.mu.Lock()
defer m.mu.Unlock()
m.next++
b.ID = m.next
m.data[b.ID] = b
return b.ID, nil
}
Now the test reads exactly like production code, minus Postgres:
// bookstore/service_test.go
func TestRegisterBook(t *testing.T) {
store := &memStore{data: make(map[int64]Book)}
svc := NewService(store)
b, err := svc.RegisterBook(context.Background(), "DDIA")
if err != nil {
t.Fatal(err)
}
if b.ID == 0 {
t.Fatal("expected non-zero ID")
}
if b.Title != "DDIA" {
t.Fatalf("got title %q, want DDIA", b.Title)
}
}
Same service code, no database needed. The test exercises RegisterBook without touching
SQL. If the storage layer changes tomorrow, the service and its tests stay the same.
etcd is a distributed key-value store where the server and client communicate over gRPC. But
if you’ve only ever used clientv3 and never peeked into the internals, you wouldn’t know
that. You call resp, err := client.Put(ctx, "key", "value") and get back a *PutResponse.
It feels like a regular Go library. The fact that gRPC and protobuf are involved is an
implementation detail that the client wrapper keeps away from you.
I’ve been building a few gRPC services at work lately, and I keep running into the same
question: what API do the users of my client library see? The server ships as a binary. The
client ships as a Go package that other teams go get. If I hand them the raw generated
gRPC stubs, they have to import my protobuf types, manage gRPC connections, configure TLS,
and parse codes.NotFound from google.golang.org/grpc/status. That’s a lot of protocol
plumbing for someone who just wants to consume my service.
This post walks through wrapping a generated gRPC client behind a higher-level Go API, following the same pattern etcd uses. The idea is to hand users a wrapper client that abstracts away the generated one.
I’ll use a small in-memory KV store as the running example.
kv/
├── api/
│ ├── kv.proto # service definition
│ ├── kv.pb.go # generated message types
│ └── kv_grpc.pb.go # generated client and server stubs
├── client/
│ └── client.go # the wrapper (what users import)
├── server/
│ └── main.go # the server binary
└── go.mod
api/ holds the proto and generated code. server/ is a binary you deploy. client/ is
the library you ship. Other teams add it to their go.mod and never touch proto types
directly.
The KV store has three RPCs: put, get, and delete.
// api/kv.proto
syntax = "proto3";
package kvpb;
option go_package = "example.com/kv/api";
service KV {
rpc Put(PutRequest) returns (PutResponse);
rpc Get(GetRequest) returns (GetResponse);
rpc Delete(DeleteRequest) returns (DeleteResponse);
}
message PutRequest { string key = 1; bytes value = 2; }
message PutResponse {}
message GetRequest { string key = 1; }
message GetResponse { bytes value = 1; optional bool found = 2; }
message DeleteRequest { string key = 1; }
message DeleteResponse {}
GetResponse uses optional bool found because proto3 normally can’t distinguish “field is
zero” from “field was never set.” The optional keyword generates a pointer in Go, which
lets callers tell a missing key apart from an empty value.
Running protoc on this generates a client interface and a server stub. The client side
looks like this:
// api/kv_grpc.pb.go (generated)
type KVClient interface {
Put(ctx context.Context, in *PutRequest,
opts ...grpc.CallOption) (*PutResponse, error)
Get(ctx context.Context, in *GetRequest,
opts ...grpc.CallOption) (*GetResponse, error)
Delete(ctx context.Context, in *DeleteRequest,
opts ...grpc.CallOption) (*DeleteResponse, error)
}
Every method takes a context.Context, a protobuf request struct, and variadic
grpc.CallOptions, and returns a protobuf response plus an error. Anyone calling the
service has to import protobuf types, construct request structs like &api.PutRequest{},
and understand gRPC call options, even for a simple “get this key” call.
The server implements the other side with an in-memory map. What we care about for the
wrapper is that it returns a gRPC NOT_FOUND status when a key doesn’t exist. The wrapper
translates that into a Go sentinel error. Here’s the server code:
// server/main.go
type server struct {
kvpb.UnimplementedKVServer
data map[string][]byte
}
func (s *server) Get(
ctx context.Context, r *kvpb.GetRequest,
) (*kvpb.GetResponse, error) {
v, ok := s.data[r.Key]
if !ok {
return nil, status.Errorf(
codes.NotFound, "key %q", r.Key)
}
return &kvpb.GetResponse{
Value: v, Found: proto.Bool(true),
}, nil
}
// Put and Delete follow the same shape.
The server embeds UnimplementedKVServer, the standard gRPC pattern. It provides no-op
implementations for all RPCs so the code compiles even before you’ve written the real logic.
The Get method checks the map and returns codes.NotFound when the key isn’t there. This
is the status code the wrapper will catch and turn into a Go error. I’ve elided Put and
Delete since they follow the same structure.
Without a wrapper, callers use the generated KVClient directly. Pay attention to the
imports:
// example/main.go (raw usage without wrapper)
import (
"context"
"google.golang.org/grpc"
"google.golang.org/grpc/credentials/insecure"
"example.com/kv/api"
)
// ...
conn, err := grpc.NewClient("localhost:9090",
grpc.WithTransportCredentials(insecure.NewCredentials()))
// ...
kv := api.NewKVClient(conn)
_, err = kv.Put(ctx, &api.PutRequest{
Key: "greeting", Value: []byte("hello"),
})
Three imports just to put a key. The caller manages the gRPC connection, constructs
&api.PutRequest{} structs for every call, and has to parse gRPC status codes to check if a
key exists. For internal code where everyone knows gRPC, this is fine. For a library you
ship to other teams, it’s a lot of ceremony.
This is the API we actually want to give our users. Same sequence as before (put a key, get it back, handle a missing key) but without any gRPC or protobuf leaking through:
// example/main.go (with the wrapper)
import "example.com/kv/client"
// ...
c, err := client.New("localhost:9090")
// ...
defer c.Close()
err = c.Put(ctx, "greeting", []byte("hello"))
val, err := c.Get(ctx, "greeting")
_, err = c.Get(ctx, "missing")
if errors.Is(err, client.ErrNotFound) { ... }
One import instead of three. No gRPC or protobuf packages in sight. Put takes a string and
a byte slice. Get returns []byte. Missing keys come back as client.ErrNotFound,
checked with errors.Is like any other Go error. The caller doesn’t need to know that gRPC
is involved at all.
Note
Callers never have to build an api.PutRequest, call grpc.NewClient, configure TLS, or
check codes.NotFound. They pass strings and byte slices, get Go errors back, and the
wrapper handles the rest.
The rest of this post builds the wrapper that turns the generated KVClient from the
previous section into this API.
The client/ package is the only thing users import. It hides the generated api.KVClient
behind a struct and re-exposes the same operations using plain Go types. The whole wrapper
lives in a single file (client/client.go).
The wrapper starts with a sentinel error and a testable interface:
// client/client.go
var ErrNotFound = errors.New("key not found")
type KV interface {
Put(ctx context.Context, key string, value []byte) error
Get(ctx context.Context, key string) ([]byte, error)
Delete(ctx context.Context, key string) error
}
ErrNotFound replaces the gRPC NOT_FOUND status code. Callers check it with errors.Is
and never import google.golang.org/grpc/codes.
Client implements KV, and KV uses only standard Go types instead of protobuf or gRPC
types. This is intentionally a producer-side interface: we define it in the same package as
Client because we know the full set of operations the service supports and we want to
offer a ready-made contract for consumers. Other packages that depend on your client can
accept a KV in their function signatures and swap in a simple in-memory fake during tests
without spinning up a gRPC server or importing any gRPC packages.
Important
KV is a producer-side interface. I wrote about when these make sense in Revisiting
interface segregation in Go.
Then the struct and constructor:
type Client struct {
conn *grpc.ClientConn
kv api.KVClient
}
func New(addr string, opts ...grpc.DialOption) (*Client, error) {
if len(opts) == 0 {
opts = []grpc.DialOption{
grpc.WithTransportCredentials(insecure.NewCredentials()),
}
}
conn, err := grpc.NewClient(addr, opts...)
if err != nil {
return nil, fmt.Errorf("connecting to %s: %v", addr, err)
}
return &Client{conn: conn, kv: api.NewKVClient(conn)}, nil
}
func (c *Client) Close() error { return c.conn.Close() }
Client holds the gRPC connection and the generated api.KVClient as unexported fields.
Note that api.KVClient is an interface, not a concrete struct. The gRPC codegen doesn’t
expose the actual client struct at all; you get back a KVClient interface from
api.NewKVClient(conn). We store it as a regular field rather than embedding it. If you
embedded the api.KVClient interface, all its methods like
Put(ctx, *PutRequest, ...CallOption) would be promoted onto Client directly, and callers
could bypass the wrapper to make raw gRPC calls.
Warning
Don’t embed the generated client interface. Keep it as a private field so the only way to talk to the server is through the wrapper methods.
New creates the gRPC connection and builds the generated client from it. The variadic
grpc.DialOption lets callers pass custom TLS, keepalive, or interceptor config. If they
pass nothing, the default is insecure credentials for local dev. The retries section below
shows what a production setup looks like.
With the types in place, we can look at the wrapper methods. Get shows the pattern all
three follow:
func (c *Client) Get(ctx context.Context, key string) ([]byte, error) {
resp, err := c.kv.Get(ctx, &api.GetRequest{Key: key})
if err != nil {
if s, ok := status.FromError(err); ok &&
s.Code() == codes.NotFound {
return nil, ErrNotFound
}
return nil, fmt.Errorf(
"getting key %s: %v", key, err)
}
return resp.Value, nil
}
// Put and Delete follow the same shape.
Each wrapper method follows the same pattern: take the caller’s Go arguments, build the protobuf request internally, call the generated client, and return plain Go types.
Pay attention to the error handling. When the server returns NOT_FOUND, we catch that gRPC
status and convert it to our own ErrNotFound sentinel so callers can check it with
errors.Is instead of parsing gRPC status codes themselves. For everything else, we wrap
with %v instead of %w. If we used %w, callers could unwrap the error with errors.As
and reach the underlying gRPC status types, which would re-couple them to gRPC internals and
defeat the whole point of having a wrapper. I wrote about this tradeoff in Go errors: to
wrap or not to wrap?.
Since the wrapper owns the grpc.NewClient call, it can bake in retries and observability
without the caller knowing. gRPC interceptors work like HTTP middleware. They wrap every RPC
with extra logic (logging, retries, metrics) without changing the handler code. You register
them as dial options when creating the connection:
// client/client.go (production version of New)
func New(addr string, opts ...grpc.DialOption) (*Client, error) {
defaults := []grpc.DialOption{
grpc.WithTransportCredentials(credentials.NewTLS(&tls.Config{})),
grpc.WithChainUnaryInterceptor(
grpc_retry.UnaryClientInterceptor(
grpc_retry.WithMax(3),
grpc_retry.WithBackoff(
grpc_retry.BackoffExponential(100*time.Millisecond),
),
),
grpcprom.UnaryClientInterceptor,
),
}
opts = append(defaults, opts...)
// ... rest is the same
}
grpc_retry from go-grpc-middleware retries failed RPCs with exponential backoff.
grpcprom records latency histograms and error rates. Same client.New, same c.Put, but
now with retries and metrics baked in. Callers who need to override the defaults can pass
their own dial options. This is useful in tests where you might want insecure credentials or
no retries.
The full code is on GitHub. Install the server and run the example:
go install github.com/rednafi/examples/wrapping-grpc-client/server@latest
server &
go install github.com/rednafi/examples/wrapping-grpc-client/example@latest
example
Running the example prints:
put greeting=hello
get greeting=hello
get missing: not found (expected)
deleted greeting
get greeting after delete: not found (expected)
Or add the client library to your own project:
go get github.com/rednafi/examples/wrapping-grpc-client/client@latest
I was mainly looking for pointers on how to organize protobuf definitions, wire up server-side metrics and interceptors, and build ergonomic client wrappers. The default answer here is often “go read the Docker or Kubernetes codebase.” But both of those are huge and take a long time to get familiar with.
Then I found etcd. It’s used by Kubernetes’ control plane for storing configs in a consistent manner. It exposes a small set of well-defined gRPC endpoints to interact with the storage layer. The core services are defined in a single rpc.proto file:
service KV {
rpc Range(RangeRequest) returns (RangeResponse);
rpc Put(PutRequest) returns (PutResponse);
rpc DeleteRange(DeleteRangeRequest) returns (DeleteRangeResponse);
rpc Txn(TxnRequest) returns (TxnResponse);
rpc Compact(CompactionRequest) returns (CompactionResponse);
}
// ...
The full file also defines Watch, Lease, Cluster, Maintenance, and Auth services.
Grokking that file and the surrounding api directory is a good way to learn how to organize
your protobufs and generated code. Some other things I picked up:
- Proto definitions live under api/, separated into subpackages like etcdserverpb, mvccpb, authpb. Generated Go code lives alongside the proto files.
- The RPC handler implementations live under server/etcdserver/api/v3rpc. key.go implements the KV service (Range, Put, DeleteRange, Txn, Compact), and the other services follow the same pattern in watch.go, lease.go, member.go, maintenance.go, auth.go.
- grpc.go shows how to assemble a gRPC server with chained unary and stream interceptors using go-grpc-middleware.
- Server-side Prometheus metrics are wired in grpc.go via grpc_prometheus.ServerMetrics interceptors. It optionally enables latency histograms when the metric type is extensive.
- metrics.go defines custom Prometheus counters and histograms on top of the standard gRPC ones, things like etcd_network_client_grpc_sent_bytes_total and watch stream durations.
- interceptor.go handles logging. newLogUnaryInterceptor logs request/response sizes at warn level when latency exceeds a threshold.
- The client has no built-in metrics. The clientv3 README says you can wire up go-grpc-prometheus yourself, but the library doesn’t do it for you.
- retry_interceptor.go implements client-side retry with backoff, safe retry classification for read-only vs mutation RPCs, and auth token refresh on failure.
- The clientv3 package wraps the generated gRPC client behind a nicer Go API. It’s a good reference if you’re building an ergonomic client on top of raw protobuf types.
- If you’re a distributed systems nerd: etcd uses Raft for consensus, and that part of the codebase is its own rabbit hole.
This has become my go-to whenever I’m wiring up another gRPC service at work. I’ve gotten comfortable enough with it over the last few weeks that I can point people to specific files when we need to make decisions.
There’s no consensus, and the answer changes depending on the kind of application you’re writing. The Go 1.13 blog already covers the mechanics and offers some guidance, but I wanted to collect more evidence of what people are actually doing in the open and share what’s worked for me.
Here’s a function that places an order by calling into a few different packages:
func placeOrder(ctx context.Context, req OrderReq) error {
user, err := users.Get(ctx, req.UserID)
if err != nil {
return err
}
err = inventory.Reserve(ctx, req.ItemID, req.Qty)
if err != nil {
return err
}
err = payments.Charge(ctx, user.PaymentID, req.Total)
if err != nil {
return err
}
return saveOrder(ctx, user.ID, req.ItemID)
}
All four calls can fail with connection refused. When one of them does, your log says:
connection refused
Which call? No idea. You grep the codebase, add temporary logging, narrow it down. In a service with dozens of dependencies, debugging this trail of errors can turn into a huge time sink.
One obvious fix is to wrap the error at every return site:
user, err := users.Get(ctx, req.UserID)
if err != nil {
return fmt.Errorf("getting user %s: %w", req.UserID, err)
}
err = inventory.Reserve(ctx, req.ItemID, req.Qty)
if err != nil {
return fmt.Errorf("reserving stock for %s: %w", req.ItemID, err)
}
Now the log says:
reserving stock for item-123: connection refused
That tells you exactly which call failed and which item it was for.
Dave Cheney advocated for this in his 2016 talk Don’t just check errors. His pkg/errors
library introduced errors.Wrap, which adds a message and a stack trace at the point where
the error occurs. The idea is that each function knows what operation it was attempting, and
that context is lost if you don’t capture it immediately.
CockroachDB takes this further. They use cockroachdb/errors, a drop-in replacement for the
stdlib errors package that captures a stack trace at every wrap site:
// cockroachdb style: stack trace at every wrap
if err := r.validateCmd(ctx, cmd); err != nil {
return errors.Wrap(err, "validating command")
}
if err := r.stage(ctx, cmd); err != nil {
return errors.Wrap(err, "staging command")
}
The Terraform AWS provider does the same thing with fmt.Errorf("...: %w", err) at every
layer. Their contributor guidelines mandate a consistent format for all resource
operations:
// terraform-provider-aws style
output, err := conn.CreateVpc(ctx, input)
if err != nil {
return fmt.Errorf("creating EC2 VPC: %w", err)
}
d.SetId(aws.ToString(output.Vpc.VpcId))
if _, err := WaitVPCAvailable(ctx, conn, d.Id()); err != nil {
return fmt.Errorf(
"waiting for EC2 VPC (%s) available: %w",
d.Id(), err,
)
}
The wrapcheck linter codifies this as a rule. It doesn’t flag every bare return err,
only errors that originated from a different package:
func placeOrder(ctx context.Context, req OrderReq) error {
// users.Get is in another package: wrapcheck flags
user, err := users.Get(ctx, req.UserID)
if err != nil {
return err // not wrapped: linter warning
}
// validate is in the same package: wrapcheck allows
err = validate(req)
if err != nil {
return err // fine, same package
}
// ...
}
The reasoning is that when an error crosses a package boundary, the receiving code is the last place that knows what it was trying to do. Within a package, the caller already has that context.
For many cases, wrapping everything is the right default:
The risk of overwrapping, especially in my private code, is much lower than the risk of underwrapping when the service crashes and you get
io.EOF.
But wrapping has costs that only show up as the codebase grows.
When every layer wraps, your error messages become nested chains:
placing order: reserving stock for item-123:
checking warehouse: querying database:
connection refused
Four layers of context for one connection refused. The middle layers (checking warehouse
and querying database) don’t add a warehouse ID or a query. They just restate the call
chain.
It also makes the error string fragile. It changes whenever someone renames an
intermediate function or refactors the call chain. If you had an alert matching on
checking warehouse: querying database: connection refused, it breaks the moment someone
renames checkWarehouse to checkStock. The same root cause (connection refused) wrapped
through different code paths produces different error strings, making it hard to aggregate
them in your logging dashboard.
Jay Conrod’s error handling guidelines address this:
Each function is responsible for including its own values in the error message, except for arguments passed to the function that returned the wrapped error.
In other words, if os.Open already puts the file path in its error, your wrapper shouldn’t
add the path again:
// redundant: the path appears twice
return fmt.Errorf("opening %s: %w", path, err)
// open /etc/app.yaml: opening /etc/app.yaml: permission denied
// better: add what you were doing, not what Open already said
return fmt.Errorf("reading config: %w", err)
// reading config: open /etc/app.yaml: permission denied
The Google Go Style Guide says the same:
When adding information to errors, avoid redundant information that the underlying error already provides.
You should still wrap, but only when you’re adding information - a user ID, an item ID, the name of the external service you were calling.
Important
If a function is just passing through a call to another function within the same package, the wrapper is noise.
%w creates contracts you didn’t mean to
%w in fmt.Errorf creates an error chain that callers can traverse with errors.Is and
errors.As. That means the wrapped error becomes part of your function’s API surface.
The Go 1.13 blog uses sql.ErrNoRows to illustrate this. Say your LookupUser function
calls database/sql internally:
func LookupUser(ctx context.Context, id string) (*User, error) {
row := db.QueryRowContext(ctx, "SELECT ...", id)
var u User
if err := row.Scan(&u.Name, &u.Email); err != nil {
return nil, fmt.Errorf(
"looking up user %s: %w", id, err,
)
}
return &u, nil
}
Because of %w, callers can now do errors.Is(err, sql.ErrNoRows) to check whether the
user wasn’t found. That works until you switch from database/sql to an ORM, or put a cache
in front of the query. The callers matching on sql.ErrNoRows silently break.
The Go 1.13 blog is explicit about this:
Wrapping an error makes that error part of your API. If you don’t want to commit to supporting that error as part of your API in the future, you shouldn’t wrap the error.
The Error Values FAQ makes the same point:
Callers can depend on the type and value of the error you’re wrapping, so changing that error can now break them. […] At that point, you must always return sql.ErrTxDone if you don’t want to break your clients, even if you switch to a different database package.
Same thing with typed errors. If your repository wraps a pgconn.PgError with %w, callers
can unwrap through to the Postgres error code:
if pgErr, ok := errors.AsType[*pgconn.PgError](err); ok {
log.Println(pgErr.Code) // e.g. "23505" (unique violation)
}
When you migrate to MySQL or put a cache in front of the database, those callers silently break.
The Google Go Style Guide notes that %w is appropriate when your package’s API
guarantees that certain underlying errors can be unwrapped and checked by callers. If you
don’t want to make that guarantee, use %v.
Important
%w makes the wrapped error part of your function’s API. Callers can errors.Is and
errors.As through it, which means they can start depending on the inner error type. If
you later change that inner error (swap databases, add a cache layer), those callers break.
Use %w only when you intend to expose the inner error.
%v as the conservative default
%v adds the same context text (the human reading the log sees the identical message) but
severs the error chain. No caller can errors.Is or errors.As through it:
// %w: callers can errors.Is(err, sql.ErrNoRows)
return fmt.Errorf("getting user %s: %w", id, err)
// %v: same message text, but the chain is severed
return fmt.Errorf("getting user %s: %v", id, err)
Both produce the same log output. But with %v, you’re free to swap the database later
without breaking callers who were depending on the inner error type.
At system boundaries, the Google Go Style Guide recommends translating rather than wrapping:
At points where your system interacts with external systems like RPC, IPC, or storage, it’s often better to translate domain-specific errors into a standardized error space (e.g., gRPC status codes) rather than simply wrapping the raw underlying error with
%w.
Say your repository layer talks to Postgres via pgx. Wrapping with %w exposes pgx
errors to callers:
func (r *UserRepo) Get(ctx context.Context, id string) (*User, error) {
row := r.db.QueryRow(ctx, "SELECT ...", id)
if err := row.Scan(&u.Name, &u.Email); err != nil {
return nil, fmt.Errorf("getting user %s: %w", id, err)
}
return &u, nil
}
Now any caller can errors.Is(err, pgx.ErrNoRows), tying them to your database driver.
Translating means mapping the storage error into your own domain before it crosses the
boundary:
var ErrNotFound = errors.New("not found")
func (r *UserRepo) Get(ctx context.Context, id string) (*User, error) {
row := r.db.QueryRow(ctx, "SELECT ...", id)
if err := row.Scan(&u.Name, &u.Email); err != nil {
if errors.Is(err, pgx.ErrNoRows) {
return nil, ErrNotFound
}
return nil, fmt.Errorf("getting user %s: %v", id, err)
}
return &u, nil
}
Callers check errors.Is(err, ErrNotFound) - which is yours - instead of
errors.Is(err, pgx.ErrNoRows). When you swap from Postgres to MySQL, callers don’t
break. And at system boundaries, consider translating entirely instead of wrapping.
The standard library also uses sentinel errors and custom error types alongside %w and
%v.
Packages like io define sentinel errors - package-level variables that callers check with
errors.Is. The io package defines EOF and returns it from Read when there’s no more
data:
// definition
var EOF = errors.New("EOF")
// inside a Reader implementation
func (r *myReader) Read(p []byte) (int, error) {
if r.pos >= len(r.data) {
return 0, io.EOF
}
// ...
}
A caller uses the sentinel to distinguish “end of input” from a real failure:
n, err := reader.Read(buf)
if errors.Is(err, io.EOF) {
// done reading, not an error
break
}
if err != nil {
return err
}
Sentinels work when the caller only needs to know which failure occurred. When callers
need structured metadata - not just identity - the stdlib uses custom error types. os.Open
defines a *fs.PathError struct and returns it with the operation name, file path, and
underlying syscall error as struct fields:
// definition in the fs package
type PathError struct {
Op string // "open", "read", "write"
Path string // the file path
Err error // the underlying syscall error
}
func (e *PathError) Unwrap() error { return e.Err }
// inside os.Open
func Open(name string) (*File, error) {
// ...
return nil, &PathError{Op: "open", Path: name, Err: err}
}
Because PathError implements Unwrap(), errors.Is(err, fs.ErrNotExist) works through
the chain. But unlike fmt.Errorf wrapping, the context is in typed struct fields. A caller
can extract those fields to decide what to do:
f, err := os.Open("/etc/app.yaml")
if err != nil {
if pathErr, ok := errors.AsType[*fs.PathError](err); ok {
// pathErr.Op is "open", pathErr.Path is "/etc/app.yaml"
// pathErr.Err is the syscall error (e.g. ENOENT)
log.Printf(
"%s failed on %s: %v",
pathErr.Op, pathErr.Path, pathErr.Err,
)
}
return err
}
net.OpError follows the same pattern with Op, Net, Source, Addr, and Err fields. The
package controls exactly what’s exposed via Unwrap(), and callers get structured metadata
they can act on programmatically.
The stdlib also uses fmt.Errorf with both %w and %v, and the database/sql package
shows why the choice matters. Rows.Scan wraps scanner errors with %w:
return fmt.Errorf(
`sql: Scan error on column index %d, name %q: %w`,
i, rs.rowsi.Columns()[i], err,
)
Before Go 1.16, Rows.Scan used %v here, which severed the chain. Custom Scanner implementations returning sentinel errors couldn’t be inspected with
errors.Is by callers. Issue #38099 fixed this by switching to %w. But in the same
package, internal type conversion errors use %v because the underlying strconv parse
error is an implementation detail callers don’t need to inspect:
return fmt.Errorf(
"converting driver.Value type %T (%q) to a %s: %v",
src, s, dv.Kind(), err,
)
The database/sql migration from %v to %w was safe because it only exposed more to
callers. Going the other direction would break callers who started depending on errors.Is.
Important
Going from %v to %w is a backwards-compatible change (it exposes more to callers).
Going from %w to %v is a breaking change (callers who relied on errors.Is or
errors.As through the chain will stop working). When in doubt, start with %v.
Kubernetes went through a similar migration. They historically used %v for most wrapping,
which meant errors.As couldn’t traverse the chain. Issue #123234 tracked the codebase-
wide migration from %v to %w, acknowledging that %v may still be preferred in some
places “to abstract the implementation details” but that such cases should be rare.
For most application code, fmt.Errorf with %w or %v is enough. Custom error types
like PathError make more sense in libraries and shared packages where callers need
structured metadata. But wrapping isn’t the only way to attach context to an error.
Dave Cheney is the person who created pkg/errors and popularized error wrapping in Go. He
eventually walked away from his own advice. In 2021, when looking for new maintainers for
pkg/errors, he wrote:
I no longer use this package, in fact I no longer wrap errors.
His reasoning was that structured logging can carry the debugging context that wrapping was meant to provide. Compare the two approaches. With wrapping, you bake the context into the error string:
err = inventory.Reserve(ctx, req.ItemID, req.Qty)
if err != nil {
    return fmt.Errorf(
        "reserving stock for %s: %w", req.ItemID, err,
    )
}
The log line looks like:
reserving stock for item-123: connection refused
With structured logging, you keep the error value clean and attach the context as separate key-value fields:
err = inventory.Reserve(ctx, req.ItemID, req.Qty)
if err != nil {
    slog.Error("reserve stock failed",
        "item_id", req.ItemID,
        "err", err,
    )
    return err
}
The log line looks like:
level=ERROR msg="reserve stock failed"
item_id=item-123 err="connection refused"
The same information is there, but in structured fields that your logging dashboard can
index, filter, and aggregate on. The error value itself stays as connection refused
without a chain of prefixes.
The tradeoff is that structured logging requires a logging pipeline that can query on fields.
If all you have is grep on a log file, the wrapping version is easier to work with.
Note
Structured logging and wrapping aren’t mutually exclusive. You can wrap at package
boundaries for the error string and log with slog at the handler for request-scoped
context (user IDs, request IDs, trace IDs). The handler example in the Services section
below does both.
So how do you actually decide? It depends on what you’re building. Marcel van Lohuizen from the Go team described his own approach:
I do and don’t… If I wanna have context, I wrap it. If I create a new error, I wrap it. But sometimes you’re not really adding too much information, and then I don’t. So it depends on the situation.
For libraries, be conservative. The Google style guide applies most directly here because you're shipping
an API contract. Use %v by default so you don’t accidentally expose implementation
details. Use %w only when you intentionally want callers to inspect the inner error, and
document that you’re doing so.
A library that wraps with %w ties its callers to its dependencies. If v2 switches from
pgx to database/sql, every caller doing errors.Is(err, pgconn.something) breaks. Use
%v by default, and define your own sentinels when callers need to branch on the error:
var ErrNotFound = errors.New("item not found")

func (c *Client) Fetch(ctx context.Context, id string) (*Item, error) {
    resp, err := c.http.Get(ctx, c.url+"/items/"+id)
    if err != nil {
        if isNotFound(err) {
            return nil, ErrNotFound
        }
        return nil, fmt.Errorf("fetching item %s: %v", id, err)
    }
    // ...
}
Callers check errors.Is(err, ErrNotFound) - which is yours - without being coupled to
your HTTP client. Same pattern as the UserRepo translation example earlier.
In CLI tools, wrap freely with %w. The call stack is shallow, the error message is the user-facing
output, and nobody is calling errors.Is on your CLI’s errors. Maximum context helps the
human reading the terminal:
func run() error {
    cfg, err := loadConfig(cfgPath)
    if err != nil {
        return fmt.Errorf("loading config %s: %w", cfgPath, err)
    }
    conn, err := connect(cfg.DatabaseURL)
    if err != nil {
        return fmt.Errorf("connecting to database: %w", err)
    }
    return migrate(conn)
}
The user sees:
loading config /etc/app.yaml:
open /etc/app.yaml: permission denied
In my experience, services are where it’s the hardest to give a formulaic answer to this. You have structured logging and distributed tracing, but you also have deep call stacks and many dependencies.
The approach I’ve landed on: wrap at package boundaries with context about what you were
trying to do. Use %w within your own codebase where callers should be able to inspect the
inner error. Use %v when the error crosses a system boundary (RPCs, database calls,
third-party APIs). Skip wrapping for same-package calls.
Here’s the placeOrder function from the beginning, rewritten:
func placeOrder(ctx context.Context, req OrderReq) error {
    user, err := users.Get(ctx, req.UserID) // (1)
    if err != nil {
        return fmt.Errorf("getting user %s: %w", req.UserID, err)
    }
    err = inventory.Reserve(ctx, req.ItemID, req.Qty) // (2)
    if err != nil {
        return fmt.Errorf("reserving stock for %s: %w", req.ItemID, err)
    }
    err = payments.Charge(ctx, user.PaymentID, req.Total) // (3)
    if err != nil {
        return fmt.Errorf("charging payment: %w", err)
    }
    return saveOrder(ctx, user.ID, req.ItemID) // (4)
}
1. users.Get is in another package: wrap with the user ID
2. inventory.Reserve is in another package: wrap with the item ID
3. payments.Charge is in another package: wrap with the operation name

At the handler, use %v to translate into the external domain without exposing internals:
func handlePlaceOrder(
    ctx context.Context, req *pb.OrderReq,
) (*pb.OrderResp, error) {
    err := placeOrder(ctx, fromProto(req))
    if err != nil {
        slog.Error("placing order",
            "user_id", req.UserId,
            "item_id", req.ItemId,
            "err", err,
        )
        // %v: context for humans, no chain for callers
        return nil, status.Errorf(codes.Internal, "placing order: %v", err)
    }
    return &pb.OrderResp{}, nil
}
The handler logs the full error with request context for debugging, then returns a gRPC
status with %v so the caller gets a useful message without being able to errors.Is
through to your database driver.
There’s no consensus on how much to wrap, and I don’t think there needs to be. Here’s what I do:
- Same-package calls: return err. The caller already has context.
- Package boundaries: fmt.Errorf("doing X: %w", err) with identifying info (user IDs, item IDs, file paths). The wrapcheck linter can enforce this automatically. Only wrap when you’re adding information the inner error doesn’t already carry.
- System boundaries: %v for the fallback path.
- Libraries: %v by default. Own sentinels (ErrNotFound, ErrConflict) for cases callers need to inspect. %w only when you intentionally want callers to unwrap, and document that you’re doing so.
- CLIs: %w everywhere. The error message is the user-facing output.
- Services: slog at the handler level for request-scoped context, so the error value doesn’t need to carry all of that.

The usual way to guard shared state in Go is to put a sync.Mutex next to the fields it protects:
var (
    mu      sync.Mutex
    counter int
)

mu.Lock()
counter++
mu.Unlock()
This works, but nothing enforces it. The compiler won’t stop you from accessing counter
without holding the lock. Forget to lock in one spot and you have a data race. One way to
make this safer is to bundle the value and its mutex into a small generic wrapper that only
exposes locked access through methods:
type Locked[T any] struct {
    mu sync.Mutex
    v  T
}

func NewLocked[T any](initial T) *Locked[T] {
    return &Locked[T]{v: initial}
}

func (l *Locked[T]) Get() T {
    l.mu.Lock()
    defer l.mu.Unlock()
    return l.v
}

func (l *Locked[T]) Set(v T) {
    l.mu.Lock()
    defer l.mu.Unlock()
    l.v = v
}
You keep mu and v unexported, pass around *Locked[T], and callers use Get to read
and Set to write:
counter := NewLocked(0)
counter.Set(42)
fmt.Println(counter.Get()) // 42
Now callers can’t touch the underlying value without going through the lock. This doesn’t prevent misuse within the same package, but it makes unprotected access from other packages impossible.
This works fine when you’re replacing the value wholesale - just call counter.Set(42) and
move on. But when your mutation depends on the current value, Get and Set can race
against each other.
Say you want to increment the counter instead of replacing it. You’d have to do:
v := counter.Get()
v++
counter.Set(v)
Each individual call is safe - Get holds the lock while reading, Set holds it while
writing. But the three calls together aren’t atomic. Between Get and Set, another
goroutine can modify the value, and your increment overwrites theirs. That’s the classic
lost-update bug.
It gets worse with compound state. Say the wrapper holds a struct:
type State struct {
    Count int
    Name  string
}

state := NewLocked(State{})
And you want to conditionally update both fields:
s := state.Get()
if s.Count < 10 {
    s.Count++
    s.Name = fmt.Sprintf("item-%d", s.Count)
}
state.Set(s)
Same problem. Get returns a copy, you mutate the copy, then Set writes it back. If
another goroutine modified state between those two calls, your write clobbers it.
Important
The race detector (go test -race) won’t catch this. It detects data races - two
goroutines accessing the same memory without synchronization. Here, every Get and Set
properly acquires the mutex, so each individual access is synchronized. The bug is a
logical race (lost update), not a data race. The race detector sees nothing wrong.
You can prove this with a simple test. Ten goroutines each increment the counter 1000 times, so the final value should be 10000:
func TestSetValue(t *testing.T) {
    counter := NewLocked(0)

    var wg sync.WaitGroup
    for range 10 {
        wg.Go(func() {
            for range 1000 {
                v := counter.Get()
                v++
                counter.Set(v)
            }
        })
    }
    wg.Wait()

    got := counter.Get()
    if got != 10000 {
        t.Errorf("got %d, want 10000 (lost %d updates)", got, 10000-got)
    }
}
Running go test -race produces no race warnings, but the test fails:
=== RUN TestSetValue
locked_test.go:30: got 1855, want 10000 (lost 8145 updates)
--- FAIL: TestSetValue (0.02s)
The race detector is silent. The updates are just gone.
Instead of taking a value, have Set take a function:
func (l *Locked[T]) Set(f func(*T)) {
    l.mu.Lock()
    defer l.mu.Unlock()
    f(&l.v)
}
Now the counter increment becomes:
counter.Set(func(v *int) {
    *v++
})
And the compound mutation:
state.Set(func(s *State) {
    if s.Count < 10 {
        s.Count++
        s.Name = fmt.Sprintf("item-%d", s.Count)
    }
})
The lock is held for the entire closure. There’s no gap between reading and writing, so no other goroutine can interfere. Both fields update together or not at all.
The function takes a pointer to T rather than a value of T for two reasons. First, it
lets you mutate the state in place instead of working on a copy. Second, if T is a large
struct, passing a pointer avoids copying the whole thing into the closure on every call.
Go’s database/sql package has an internal withLock helper that follows the same pattern:
// withLock runs while holding lk.
func withLock(lk sync.Locker, fn func()) {
    lk.Lock()
    defer lk.Unlock() // in case fn panics
    fn()
}
It’s used throughout database/sql to serialize access to the underlying driver connection.
For example, when pinging a connection:
if pinger, ok := dc.ci.(driver.Pinger); ok {
    withLock(dc, func() {
        err = pinger.Ping(ctx)
    })
}
Or when preparing a statement:
withLock(dc, func() {
    si, err = ctxDriverPrepare(ctx, dc.ci, query)
})
Or committing a transaction:
withLock(tx.dc, func() {
    err = tx.txi.Commit()
})
There are about 18 call sites in sql.go alone. In those snippets, dc is a
*driverConn - the struct that wraps a database driver connection. It embeds sync.Mutex
directly, so it satisfies sync.Locker and can be passed straight to withLock.
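The embedding trick is easy to replicate. Here's a sketch with a hypothetical conn struct: because sync.Mutex is embedded, *conn picks up Lock and Unlock methods and satisfies sync.Locker without any extra code:

```go
package main

import (
	"fmt"
	"sync"
)

// conn embeds sync.Mutex the way *driverConn does, so *conn
// satisfies sync.Locker directly.
type conn struct {
	sync.Mutex
	queries int
}

// withLock is the same helper pattern as in database/sql.
func withLock(lk sync.Locker, fn func()) {
	lk.Lock()
	defer lk.Unlock()
	fn()
}

func main() {
	c := &conn{}
	withLock(c, func() { c.queries++ }) // *conn passed straight in as a Locker
	fmt.Println(c.queries)              // 1
}
```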
Note
withLock accepts sync.Locker instead of *sync.Mutex, so it also works with the read
side of an RWMutex:
withLock(rs.closemu.RLocker(), func() {
    doClose, ok = rs.nextLocked()
})
Here rs.closemu is a sync.RWMutex, and .RLocker() returns a sync.Locker that
acquires the read lock. The same withLock function handles both cases.
In 2021, twmb filed proposal #49563 to add a Mutex.Locked(func()) method to the standard
library:
func (m *Mutex) Locked(fn func()) {
    m.Lock()
    defer m.Unlock()
    fn()
}
The idea was that if sync.Mutex had this method natively, you wouldn’t need to write a
wrapper at all for simple cases - you’d just call mu.Locked(fn) directly. It also
eliminates forgotten unlocks and guards against panics leaving the mutex locked. esote
pointed out that database/sql already had an internal version of this - the same
withLock helper we saw earlier.
zephyrtronium raised the sync.Locker point:
I think there are advantages to making this a function that takes a Locker rather than a method on Mutex. This would allow using it with either end of an RWMutex, or another custom Locker.
rsc declined it on philosophical grounds:
In general we try not to have two different ways to do something, and for better or worse we have the current idioms.
The more interesting pushback came from bcmills, who argued the proposal didn’t go far enough. With generics arriving, he wanted something that also prevents unguarded access to the protected data, not just forgotten unlocks:
Now that we have generics on the way, I would rather see us move in a direction that also eliminates unlocked-access bugs, not just incrementally update
Mutex for forgotten-defer bugs.
He sketched out what that could look like:
type Synchronized[T any] struct {
    mu  Mutex
    val T
}

func (s *Synchronized[T]) Do(fn func(*T)) {
    s.mu.Lock()
    defer s.mu.Unlock()
    fn(&s.val)
}
This is essentially the Locked[T] wrapper from the beginning of this post. The proposal
was declined, but bcmills’ suggestion is the direction the community ended up going
anyway, just outside the standard library.
Tailscale’s syncs package has a MutexValue[T] type that follows this direction:
type MutexValue[T any] struct {
    mu sync.Mutex
    v  T
}

func (m *MutexValue[T]) WithLock(f func(p *T)) {
    m.mu.Lock()
    defer m.mu.Unlock()
    f(&m.v)
}

func (m *MutexValue[T]) Load() T {
    m.mu.Lock()
    defer m.mu.Unlock()
    return m.v
}

func (m *MutexValue[T]) Store(v T) {
    m.mu.Lock()
    defer m.mu.Unlock()
    m.v = v
}
They provide both Store for simple replacements and WithLock for compound mutations.
When you need to read-modify-write, you go through WithLock so the lock covers the whole
operation.
If T is small and you only ever replace the whole value without reading it first, a plain
Set works. A boolean flag that gets toggled from one place, a config value that gets
swapped wholesale - those are fine.
But most state doesn’t stay that simple. You start with a single integer, it becomes a
struct with three fields, and now you need to update two of them based on the third. At that
point, Set(func(*T)) is the only safe option.
Important
The proposal benchmarks showed about 35% overhead for the closure-based approach (14.65
ns/op vs 10.82 ns/op for direct lock/unlock) due to closures and defer not being
inlineable. In practice this rarely matters. If your critical section does any real work,
the lock overhead dominates.
Anyone who has operated a Go service has stared at context canceled and
context deadline exceeded errors. These errors usually tell you that a context was
canceled, but not exactly why. In a typical client-server scenario, the reason could be any
of the following:
- The request’s own timeout fired
- A parent context’s deadline expired
- The client disconnected
- A graceful shutdown began
- Some code called cancel() explicitly

Go 1.20 and 1.21 added cause-tracking functions to the context package that fix this, but
there’s a subtlety with WithTimeoutCause that most examples skip.
there’s a subtlety with WithTimeoutCause that most examples skip.
Here’s a function that processes an order by calling three services under a shared 5-second timeout:
func processOrder(ctx context.Context, orderID string) error {
    ctx, cancel := context.WithTimeout(ctx, 5*time.Second) // (1)
    defer cancel() // (2)

    if err := checkInventory(ctx, orderID); err != nil {
        return err // (3)
    }
    if err := chargePayment(ctx, orderID); err != nil {
        return err
    }
    return shipOrder(ctx, orderID)
}
When a context gets canceled, the underlying reason is either context.Canceled or
context.DeadlineExceeded. Libraries wrap these in their own types (*url.Error for
net/http, gRPC status codes for grpc), but errors.Is still matches the sentinel.
So if checkInventory makes an HTTP call and the client disconnects while it’s in flight,
the error that bubbles all the way up is:
context canceled
If the 5-second timeout fires while chargePayment is waiting on a slow payment gateway:
context deadline exceeded
Two sentinel errors. No reason, no origin, nothing. The caller of processOrder has no idea
what actually happened.
You’d think wrapping the error helps:
if err := checkInventory(ctx, orderID); err != nil {
    return fmt.Errorf("checking inventory for %s: %w", orderID, err)
}
Now the log says:
checking inventory for ord-123: context canceled
Better. You know it happened during the inventory check. But you still don’t know why the context was canceled. Was it the 5-second timeout? A parent context’s deadline? The client hanging up? A graceful shutdown signal? The error doesn’t say.
Without the cause, you can’t tell whether to retry, alert, or ignore, and your logs don’t give on-call enough to triage.
When this happens in production, you end up scanning logs for other errors around the same timestamp, hoping something nearby gives you a clue. If the logs don’t help, you trace the context from where it was created, through every function that receives it, looking for cancel calls and timeouts. In a small service this takes a few minutes. In a larger codebase with middleware, interceptors, and nested timeouts, it can take a lot longer.
This has been a known pain point in the Go community for years. Bryan C. Mills noted this in issue #26356 back in 2018:
I’ve seen this sort of issue crop up several times now. I wonder if
context.Context should record a bit of caller information… Then we could add a debugging hook to interrogate why a particular context.Context was cancelled.
On proposal #51365, which eventually led to the cause APIs, bullgare described the production experience:
I had a case when on production I got random “context canceled” log messages. And in the case like that you don’t even know where to dig and how to investigate it further. Or how to reproduce it on a local machine.
That proposal led to the cause APIs that shipped in Go 1.20.
context.WithCancelCause gives you a CancelCauseFunc that takes an error instead of a
plain CancelFunc. Here’s the same processOrder rewritten to use it:
func processOrder(ctx context.Context, orderID string) error {
    ctx, cancel := context.WithCancelCause(ctx)
    defer cancel(nil) // (1)

    if err := checkInventory(ctx, orderID); err != nil {
        cancel(fmt.Errorf(
            "order %s: inventory check failed: %w", orderID, err,
        )) // (2)
        return err
    }
    if err := chargePayment(ctx, orderID); err != nil {
        cancel(fmt.Errorf(
            "order %s: payment failed: %w", orderID, err,
        ))
        return err
    }
    return shipOrder(ctx, orderID)
}
1. defer cancel(nil) as the default sets the cause to context.Canceled
2. The failure paths record a specific cause, wrapping the original error with %w

Now you can read the cause with context.Cause(ctx). If checkInventory fails because of a
connection error, the cause comes back as:
order ord-123: inventory check failed: connection refused
Instead of just context canceled. You know it was the inventory check, you know it was a
connection error, and because the original error is wrapped with %w, the full error chain
is preserved for programmatic inspection.
The first call to cancel wins. Once a cause is recorded, subsequent calls are no-ops. So
defer cancel(nil) only takes effect if nothing else canceled the context first. This means
the most specific cancel, the one closest to the actual failure, is what gets recorded. If
checkInventory sets a cause and then defer cancel(nil) runs on the way out, the
inventory cause is preserved.
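First-cancel-wins is easy to see in a few lines (the error message here is illustrative):

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// demo cancels the same context twice and returns the recorded cause.
func demo() error {
	ctx, cancel := context.WithCancelCause(context.Background())

	cancel(errors.New("inventory check failed")) // first cancel records the cause
	cancel(nil)                                  // no-op: a cause is already set

	return context.Cause(ctx)
}

func main() {
	fmt.Println(demo()) // inventory check failed
}
```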
context.Cause is a standalone function rather than a method on Context because Go’s
compatibility promise means the Context interface can’t add new methods. Err() will
always return nil, Canceled, or DeadlineExceeded. If you call context.Cause on a
context that wasn’t created with one of the cause-aware functions, it returns whatever
ctx.Err() returns. On an uncanceled context, it returns nil.
This handles explicit cancellation, but the function still has no timeout. The original
version used WithTimeout for the 5-second deadline. To label that timeout with a cause, Go
1.21 added WithTimeoutCause:
ctx, cancel := context.WithTimeoutCause(
    ctx,
    5*time.Second,
    fmt.Errorf("order %s: 5s processing timeout exceeded", orderID),
)
defer cancel()
When the timer fires, context.Cause(ctx) returns the custom error instead of a bare
context.DeadlineExceeded. There’s also WithDeadlineCause, which is the same thing but
takes an absolute time.Time. If all you need is a label on the timeout path,
WithTimeoutCause works. But there’s a subtlety in how it interacts with defer cancel()
that can silently discard your cause.
WithTimeoutCause returns (Context, CancelFunc), not (Context, CancelCauseFunc). The
cancel function you get back doesn’t accept an error argument. Proposal #56661 defined it
this way explicitly:
func WithTimeoutCause(
    parent Context, timeout time.Duration, cause error,
) (Context, CancelFunc)
Think about what happens when processOrder finishes normally in 100ms, well before the
5-second timeout:
ctx, cancel := context.WithTimeoutCause(
    ctx,
    5*time.Second,
    fmt.Errorf("order %s: 5s timeout exceeded", orderID),
)
defer cancel() // (1)
// ... returns in 100ms ...
1. cancel() fires on return, before the timer

If the timer fires first (the function ran too long), the context is canceled with
DeadlineExceeded and context.Cause(ctx) returns your custom message. That path works
correctly.
But if the function returns first, which is the common case, defer cancel() fires. Since
it’s a plain CancelFunc, it can’t take a cause argument. The Go source shows what it does
internally:
return c, func() { c.cancel(true, Canceled, nil) }
It passes Canceled with a nil cause. Your custom cause only gets recorded when the
internal timer fires. On the normal return path, the cause is just context.Canceled.
This isn’t a bug. WithTimeoutCause is a new function, so it could have returned
CancelCauseFunc. The Go team chose not to. rsc explained the reasoning when closing
proposal #51365:
WithDeadlineCause and WithTimeoutCause require you to say ahead of time what the cause will be when the timer goes off, and then that cause is used in place of the generic DeadlineExceeded. The cancel functions they return are plain CancelFuncs (with no user-specified cause), not CancelCauseFuncs, the reasoning being that the cancel on one of these is typically just for cleanup and/or to signal teardown that doesn’t look at the cause anyway.
He also acknowledged that this creates a subtle distinction between the two APIs:
That distinction makes sense, but it makes
WithDeadlineCause and WithTimeoutCause different in an important, subtle way from WithCancelCause. We missed that in the discussion…
So WithTimeoutCause only carries the custom cause when the timeout actually fires. On the
normal return path and on any explicit cancellation path, defer cancel() discards it. If
you have a middleware that logs context.Cause(ctx) for every request, it’ll see
context.Canceled instead of something useful on the most common path.
The way around this is to skip WithTimeoutCause and wire the timer yourself using
WithCancelCause. Since there’s only one CancelCauseFunc, every path goes through the
same door, and first-cancel-wins handles the rest. Here’s processOrder one more time:
func processOrder(ctx context.Context, orderID string) error {
    ctx, cancel := context.WithCancelCause(ctx) // (1)
    defer cancel(errors.New("processOrder completed")) // (2)

    timer := time.AfterFunc(5*time.Second, func() {
        cancel(fmt.Errorf("order %s: 5s timeout exceeded", orderID)) // (3)
    })
    defer timer.Stop() // (4)

    if err := checkInventory(ctx, orderID); err != nil {
        cancel(fmt.Errorf(
            "order %s: inventory check failed: %w", orderID, err,
        ))
        return err
    }
    if err := chargePayment(ctx, orderID); err != nil {
        cancel(fmt.Errorf("order %s: payment failed: %w", orderID, err))
        return err
    }
    return shipOrder(ctx, orderID)
}
1. One CancelCauseFunc for everything
2. The completion cause for the normal return path
3. The timer records the timeout cause
4. Stop the timer so it can’t fire after return

Three possible paths, one cancel function. If the timer fires, context.Cause(ctx) returns:
order ord-123: 5s timeout exceeded
If checkInventory fails with a connection error:
order ord-123: inventory check failed: connection refused
On normal completion:
processOrder completed
This is actually what the stdlib does internally; WithDeadline uses time.AfterFunc under
the hood.
The trade-off is that ctx.Err() always returns context.Canceled, never
context.DeadlineExceeded, because you’re using WithCancelCause instead of WithTimeout.
ctx.Deadline() also returns the zero value, which matters if downstream code or frameworks
use it to propagate deadlines (gRPC, for example, sends the deadline across service
boundaries via ctx.Deadline()). If downstream code branches on
errors.Is(err, context.DeadlineExceeded), that check won’t match either.
If downstream code relies on errors.Is(err, context.DeadlineExceeded) to distinguish
timeouts from explicit cancellations, stack a WithCancelCause on top of a
WithTimeoutCause:
func processOrder(ctx context.Context, orderID string) error {
    ctx, cancelCause := context.WithCancelCause(ctx) // (1)
    ctx, cancelTimeout := context.WithTimeoutCause( // (2)
        ctx,
        5*time.Second,
        fmt.Errorf("order %s: 5s timeout exceeded", orderID),
    )
    defer cancelTimeout() // (3)
    defer cancelCause(errors.New("processOrder completed")) // (4)

    if err := checkInventory(ctx, orderID); err != nil {
        cancelCause(fmt.Errorf(
            "order %s: inventory check failed: %w", orderID, err,
        ))
        return err
    }
    if err := chargePayment(ctx, orderID); err != nil {
        cancelCause(fmt.Errorf(
            "order %s: payment failed: %w", orderID, err,
        ))
        return err
    }
    return shipOrder(ctx, orderID)
}
When the timeout fires, the inner context gets canceled with DeadlineExceeded and the
custom cause. errors.Is(ctx.Err(), context.DeadlineExceeded) works as expected. On the
error path, cancelCause(specificErr) cancels the outer context, which propagates to the
inner. On normal completion, cancelCause(errors.New("processOrder completed")) runs first because of
LIFO defer ordering, canceling the outer and propagating to the inner. Then
cancelTimeout() finds the inner already canceled and does nothing.
Note
Notice the defer ordering. cancelCause must be deferred after cancelTimeout so it
runs before it (LIFO). If you reverse them, cancelTimeout() cancels the inner context
with context.Canceled before cancelCause gets a chance to set a meaningful cause.
One subtlety: after line (2), ctx points to the inner context. If you call
context.Cause(ctx) on it after a cancelCause(specificErr) call, you’ll see
context.Canceled (propagated from the outer), not the specific error. The specific cause
lives on the outer context. In practice this doesn’t matter because the caller inspects the
returned error, not context.Cause, but it’s worth knowing if you add logging inside
processOrder itself.
The manual timer pattern is simpler and covers most cases. This stacked approach is for when
downstream code specifically relies on errors.Is(err, context.DeadlineExceeded).
context.Cause returns an error, so the full errors.Is and errors.As machinery works
on it. Since the cause in processOrder wraps the original error with %w, you can unwrap
through it to reach the underlying error.
If checkInventory failed because the inventory service refused the connection, the cause
is "order ord-123: inventory check failed: connection refused", and the wrapped error is a
*net.OpError. You can pull it out:
cause := context.Cause(ctx)

var netErr *net.OpError
if errors.As(cause, &netErr) {
    // The inventory service is unreachable.
    slog.Error("network failure",
        "op", netErr.Op,
        "addr", netErr.Addr,
    )
}
errors.Is works the same way. If the timer cause had wrapped context.DeadlineExceeded
(e.g., with fmt.Errorf("order timeout: %w", context.DeadlineExceeded)), you could check
for it:
if errors.Is(context.Cause(ctx), context.DeadlineExceeded) {
    // A timeout fired; maybe adjust the deadline or retry.
}
For logging, ctx.Err() and context.Cause(ctx) serve different purposes. ctx.Err()
gives you the category (cancellation or timeout), and context.Cause(ctx) gives you the
specific reason. Keeping them as separate structured log fields makes them easy to query:
if ctx.Err() != nil {
    slog.Error("request failed",
        "err", ctx.Err(),
        "cause", context.Cause(ctx),
    )
}
That produces:
level=ERROR msg="request failed" err="context deadline exceeded"
cause="order ord-123: 5s timeout exceeded"
A useful pattern is wrapping the request context with WithCancelCause at the middleware
level so every handler downstream gets automatic cause tracking. The cancel function is
stashed in the context via WithValue so handlers can pull it out and set a specific cause:
type cancelCauseKey struct{}

func withCause(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        ctx, cancel := context.WithCancelCause(r.Context()) // (1)
        defer cancel(errors.New("request completed")) // (2)

        ctx = context.WithValue(ctx, cancelCauseKey{}, cancel) // (3)
        next.ServeHTTP(w, r.WithContext(ctx))

        if ctx.Err() != nil { // (4)
            slog.Error("request context canceled",
                "method", r.Method,
                "path", r.URL.Path,
                "err", ctx.Err(),
                "cause", context.Cause(ctx),
            )
        }
    })
}
1. Wrap the request context with WithCancelCause
2. The default cause on normal completion
3. Stash the cancel function so handlers can set a specific cause
4. Logged after the handler returns; defer cancel(...) hasn’t run yet at this point

Any handler can pull the cancel function out and set a cause:
func handleOrder(w http.ResponseWriter, r *http.Request) {
    cancel, _ := r.Context().
        Value(cancelCauseKey{}).(context.CancelCauseFunc)

    if err := processOrder(r.Context()); err != nil {
        cancel(fmt.Errorf("order processing failed: %w", err))
        http.Error(w, "order failed", http.StatusInternalServerError)
        return
    }
    // ...
}
First cancel wins, so the most specific reason is what shows up in the middleware log.
streamingfast/substreams uses this approach in production, storing a CancelCauseFunc in
the request context so worker pools downstream can cancel with a specific error.
One thing to know: the stdlib’s HTTP server and most third-party libraries cancel contexts
without setting a cause, since they predate Go 1.20. If a client disconnects,
context.Cause(ctx) will return context.Canceled, not a custom error. The cause APIs are
most useful for reasons set by your own code.
Most of the time, WithCancelCause is all you need. It covers explicit cancellation with a
specific reason, and context.Cause gives you a way to read it back. If you also need a
timeout, WithTimeoutCause labels the deadline path without extra wiring. The gotcha is
that defer cancel() on the normal return path discards the cause, so if you need causes on
every path, including normal completion, the manual timer pattern fills that gap. The
stacked approach on top of that is for when downstream code also needs DeadlineExceeded.
The cause APIs have seen steady adoption since Go 1.20. golang.org/x/sync/errgroup uses
WithCancelCause internally since v0.3.0, so context.Cause(ctx) on an errgroup-canceled
context returns the actual goroutine error. docker cli uses it to distinguish OS signals
from normal cancellation. kubernetes cluster-api migrated its codebase to the *Cause
variants. gRPC-Go had a proposal to use it for distinguishing client disconnects from gRPC
timeouts and connection closures.
A go statement just launches a goroutine and walks away. There’s no scope
that waits for it, no automatic cancellation if the parent dies, no built-in way to collect
its errors.
This post looks at where the idea of structured concurrency comes from, what it looks like
in Python and Kotlin, and how you get the same behavior in Go using errgroup, WaitGroup,
and context.
In 1968, Dijkstra wrote a letter to the editor of Communications of the ACM titled Go To
Statement Considered Harmful. His core argument was that unrestricted use of goto made
programs nearly impossible to reason about:
The unbridled use of the go to statement has as an immediate consequence that it becomes terribly hard to find a meaningful set of coordinates in which to describe the process progress.
Structured programming replaced goto with scoped constructs like if, while, and
functions. The key insight was that control flow should be lexically scoped: you can look at
a block of code and know where it starts, where it ends, and that everything in between
finishes before execution moves on.
The same problem showed up later in concurrent programming.
Spawning a thread or goroutine that outlives its parent is the concurrency equivalent of
goto. The spawned work escapes the scope that created it, and now you have to reason about
lifetimes that cross boundaries.
Martin Sustrik, creator of ZeroMQ, coined the term “structured concurrency” in his Structured Concurrency blog post. He framed the idea as an extension of how block lifetimes work in structured programming:
Structured concurrency prevents lifetime of green thread B launched by green thread A to exceed lifetime of A.
Eric Niebler later expanded on Sustrik’s idea, tying it directly to how function calls work in sequential code:
“Structured concurrency” refers to a way to structure async computations so that child operations are guaranteed to complete before their parents, just the way a function is guaranteed to complete before its caller.
– Eric Niebler, Structured Concurrency (Niebler)
Nathaniel J. Smith (NJS) took this further in his essay Notes on structured concurrency:
That’s right: go statements are a form of goto statement.
NJS’s broader point was that spawning a background task breaks function abstraction the same
way goto does. Once a function can spawn work that outlives it, the caller can no longer
reason about when the function’s effects are complete:
Every time our control splits into multiple concurrent paths, we want to make sure that they join up again.
Structured concurrency boils down to a few rules: every concurrent task runs inside a scope; the scope doesn’t exit until all of its children have finished; cancellation flows down from the scope to its children; and errors flow up from the children to the scope.
This essay prompted Go proposal #29011, filed by smurfix, which proposed adding structured concurrency to Go. NJS participated in the discussion and made a point that stuck with me:
Right now you can structure things this way in Go, but it’s way more cumbersome than just typing go myfunc(), so Go ends up encouraging the “unstructured” style.
– Nathaniel J. Smith, Go proposal #29011
The proposal was eventually closed. Before getting into Go’s approach, it helps to see what structured concurrency actually looks like in practice across the three languages.
Python 3.11 introduced asyncio.TaskGroup as the structured concurrency primitive. Here’s an example that runs three tasks concurrently, where one of them fails:
import asyncio
async def fetch(url: str, should_fail: bool = False) -> str:
await asyncio.sleep(0.1) # (1)
if should_fail:
raise ValueError(f"failed to fetch {url}")
return f"fetched {url}"
async def main() -> None:
try:
async with asyncio.TaskGroup() as tg: # (2)
tg.create_task(fetch("/api/users")) # (3)
tg.create_task(fetch("/api/orders", should_fail=True))
tg.create_task(fetch("/api/products"))
except* ValueError as eg: # (4)
for exc in eg.exceptions:
print(f"caught: {exc}")
finally:
print("cleanup runs no matter what") # (5)
asyncio.run(main())
Here:
(1) await is a cancellation point; the runtime can interrupt the coroutine here
(2) async with creates a scope that waits for all tasks to finish
(3) create_task starts a task that belongs to the group
(4) except* catches the ValueErrors collected into an ExceptionGroup
(5) finally runs regardless of success or failure
The thing that makes this work is that await expressions are cancellation points. When the group decides to cancel, the runtime delivers that cancellation at the next await in each running coroutine.
Kotlin has had structured concurrency since kotlinx.coroutines 0.26. The equivalent construct is coroutineScope. Here’s the same scenario with three tasks and one failure:
import kotlinx.coroutines.*
suspend fun fetch(url: String, shouldFail: Boolean = false): String {
delay(100) // (1)
if (shouldFail) throw IllegalStateException("failed to fetch $url")
return "fetched $url"
}
suspend fun main() {
try {
coroutineScope { // (2)
launch { fetch("/api/users") } // (3)
launch { fetch("/api/orders", shouldFail = true) }
launch { fetch("/api/products") }
}
} catch (e: IllegalStateException) { // (4)
println("caught: ${e.message}")
} finally {
println("cleanup runs no matter what") // (5)
}
}
Here:
(1) delay is a suspension point where cancellation can be delivered
(2) coroutineScope waits for all children and cancels siblings if one fails
(3) launch starts a coroutine tied to this scope
(4) catch receives the exception once the scope has cancelled its children
(5) finally runs as expected
Like Python’s await, Kotlin’s suspending functions (delay, channel operations, etc.) are cancellation points. When the scope cancels, the runtime delivers a CancellationException at the next suspension point in each running coroutine.
Kotlin also has supervisorScope, which is the variant where siblings keep running when one fails. We’ll see the Go equivalent of that shortly.
Go’s go statement is unstructured. When you write go func() { ... }(), the runtime
spawns a background goroutine and immediately moves on. The calling function doesn’t wait
for it, doesn’t get notified when it finishes, and has no way to cancel it. Unless you
explicitly synchronize with something like a WaitGroup or a channel, that goroutine can
outlive the function that spawned it. There’s no built-in scope that ties their lifetimes
together.
But you can compose the same patterns using channels, sync.WaitGroup, context, and
errgroup from x/sync.
This is Go’s equivalent of TaskGroup and coroutineScope. Same scenario: three tasks, one
fails, siblings get cancelled:
func run() error {
g, ctx := errgroup.WithContext(context.Background()) // (1)
g.Go(func() error { // (2)
select {
case <-ctx.Done():
return ctx.Err()
case <-time.After(100 * time.Millisecond):
fmt.Println("fetched /api/users")
return nil
}
})
g.Go(func() error { // (3)
return fmt.Errorf("failed to fetch /api/orders")
})
g.Go(func() error { // (4)
select {
case <-ctx.Done():
return ctx.Err()
case <-time.After(100 * time.Millisecond):
fmt.Println("fetched /api/products")
return nil
}
})
err := g.Wait() // (5)
fmt.Println("cleanup runs no matter what")
return err
}
Here:
(1) errgroup.WithContext returns a group plus a context that is cancelled as soon as any goroutine returns an error
(2) this goroutine races its work against ctx.Done()
(3) this goroutine fails immediately, which cancels the shared context
(4) the other sibling observes the cancellation via ctx.Done()
(5) Wait blocks until all goroutines finish and returns the first non-nil error
Notice how the Go version requires each goroutine to explicitly check ctx.Done(). In Python and Kotlin, the runtime handles that at await/suspension points. In Go, you wire it in yourself.
This is Go’s equivalent of Kotlin’s supervisorScope. Siblings keep running regardless of
individual failures:
func run() []error {
var (
wg sync.WaitGroup
mu sync.Mutex
errs []error
)
urls := []string{"/api/users", "/api/orders", "/api/products"}
for _, url := range urls {
wg.Go(func() { // (1)
time.Sleep(100 * time.Millisecond)
if url == "/api/orders" {
mu.Lock()
err := fmt.Errorf("failed to fetch %s", url)
errs = append(errs, err) // (2)
mu.Unlock()
return
}
fmt.Printf("fetched %s\n", url)
})
}
wg.Wait() // (3)
fmt.Println("cleanup runs no matter what")
return errs
}
Here:
(1) wg.Go launches a goroutine and handles Add/Done internally (Go 1.25+)
(2) errors are appended under a mutex because several goroutines may fail concurrently
(3) Wait blocks until all goroutines finish
Those two examples cover Go’s equivalents of the structured patterns in Python and Kotlin. But the code looks noticeably different, and the reason comes down to how these runtimes handle concurrent execution.
The fundamental difference between the Python/Kotlin approach and Go’s approach comes down to how cancellation gets delivered.
In Python, async def functions are coroutines. They run on a single-threaded event loop
and yield control at every await. In Kotlin, suspend functions are coroutines. They run
on cooperative dispatchers (which can be backed by thread pools) and yield at every
suspension point. Both languages have colored functions (async/suspend) - the “color”
means async functions can only be called from other async functions, which lets the runtime
track every point where a coroutine can yield. These yield points are also cancellation
points, so when a scope cancels, the runtime delivers the cancellation at the next such
point.
Go’s goroutines aren’t coroutines. They’re functions running on a preemptive scheduler
backed by OS threads. The runtime multiplexes goroutines onto OS threads and can preempt
them, but it has no knowledge of application-level cancellation. There’s no concept of a
“suspension point” where the runtime can inject a cancellation signal. A goroutine doing
CPU-bound work will keep running even if its context was cancelled. The goroutine has to
check ctx.Done() explicitly via a select statement.
Here’s the cooperative pattern in Go:
func worker(ctx context.Context) error {
for {
select {
case <-ctx.Done(): // (1)
return ctx.Err()
default:
doUnitOfWork() // (2)
}
}
}
And here’s a goroutine that ignores cancellation:
func busyWorker(ctx context.Context) {
for {
// CPU-bound work, never checks ctx.Done()
heavyComputation()
}
}
This goroutine will keep running until the process exits, regardless of whether its context was cancelled.
Python and Kotlin workers also need to cooperate for cancellation to actually work. If a
coroutine does CPU-bound work without hitting an await or a suspension point, the runtime
can’t interrupt it either.
In Python, a non-cooperative worker looks like this:
async def stubborn_worker() -> None:
while True:
heavy_computation() # (1)
Here:
(1) the loop never hits an await anywhere, so the runtime never gets a chance to deliver cancellation
To make it cooperative, you insert an explicit cancellation check:
async def cooperative_worker() -> None:
while True:
await asyncio.sleep(0) # (1)
heavy_computation()
Here:
(1) await asyncio.sleep(0) yields control back to the event loop, giving it a chance to cancel this coroutine
In Kotlin, the same situation looks like this:
suspend fun stubbornWorker() {
while (true) {
heavyComputation() // (1)
}
}
To fix this, use coroutineContext.ensureActive() to check whether the coroutine’s scope
has been cancelled:
suspend fun cooperativeWorker() {
while (true) {
coroutineContext.ensureActive() // (1)
heavyComputation()
}
}
Here:
(1) ensureActive throws CancellationException if the scope has been cancelled
This isn’t too different from what Go does with ctx.Done(). In all three languages, a
tight loop doing CPU-bound work won’t cancel unless the worker explicitly checks. The
difference is that in Python and Kotlin, most standard library functions (asyncio.sleep,
delay, channel operations) are cancellation points by default, so you hit them naturally
in typical code.
Go’s concurrency model is built on CSP (Communicating Sequential Processes). Goroutines
communicate via channels, not via structured scopes. The go statement is deliberately
low-level. It gives you a concurrent execution unit and gets out of your way.
Python and Kotlin start from the structured side and require you to opt out. Python’s
asyncio.create_task outside a group, or Kotlin’s supervisorScope, are the escape
hatches. Go starts from the unstructured side and requires you to opt in. errgroup and
WaitGroup are how you add structure. Different design priorities lead to different
defaults.
Go proposal #29011 was closed after Ian Lance Taylor pointed out the practical problem:
I think these ideas are definitely interesting. But your specific suggestion would break essentially all existing Go code, so that is a non-starter.
In a later comment, he acknowledged that there are good ideas in the space but argued for improving existing primitives rather than changing the language:
There are likely good ideas in the area of structured concurrency that we can do better at, in the language or the standard library or both.
NJS also noted that structured concurrency helps with error propagation, because when a
goroutine exits with an error, there is somewhere to propagate that error to. That’s a real
shortcoming of the current model. The response from the Go team was that errgroup,
context, and WaitGroup already provide the building blocks, and language-level changes
weren’t justified given the cost.
There’s also a Trio forum discussion on Go’s situation. NJS was cautious about overstating
the benefits, noting that structured concurrency wouldn’t have prevented about a quarter of
the concurrency bugs in a study on real-world Go bugs they examined (classic race
conditions). But he pointed out that some of the hardest-to-understand bugs involved
standard library modules that spawned surprising background goroutines. That couldn’t happen
in a language with truly scoped concurrency. He also observed that all mistakes in using
Go’s WaitGroup API seemed like they’d be trivially prevented by structured concurrency.
If you’re writing Go and want structured concurrency, there are a few practices that help. The core idea is:
Never start a goroutine without knowing when it will stop.
– Dave Cheney, Practical Go
Here are some concrete ways to follow that:
Know the lifetime of every goroutine you spawn. Before writing go func(), you should
be able to answer: what signals this goroutine to stop, and what waits for it to finish?
If you can’t answer both, the goroutine’s lifetime is unknown and it can leak.
Use go func() sparingly. A bare go func() { ... }() sends a goroutine into the
background with no handle to wait on it or cancel it. Prefer launching goroutines through
errgroup or behind a WaitGroup so something always owns their lifetime.
Let the caller decide concurrency. If you’re writing a library function, return a result instead of spawning a goroutine internally. Let the caller choose how to run it concurrently. This keeps goroutine lifetimes visible at the call site.
Pass context down, check it inside. Accept context.Context as the first parameter
and check ctx.Done() in long-running loops or blocking operations. This is how the
caller communicates “I don’t need this anymore.”
Here’s what a well-structured goroutine launch pattern looks like:
func processItems(ctx context.Context, items []string) error {
g, ctx := errgroup.WithContext(ctx) // (1)
for _, item := range items {
g.Go(func() error { // (2)
select {
case <-ctx.Done():
return ctx.Err()
default:
return handle(ctx, item) // (3)
}
})
}
return g.Wait() // (4)
}
Here:
(1) derive the group and a child context from the caller’s context
(2) every goroutine is launched through the group, so Wait knows about it
(3) handle receives ctx and can observe cancellation
(4) Wait joins every goroutine and returns the first error
Every goroutine has a clear owner and exit condition. If any task fails, the context cancels and the others observe it on their next check.
Since Go doesn’t enforce structured concurrency at the language level, it’s possible to leak goroutines or miss cancellation signals. I wrote about one common case in early return and goroutine leak.
There are a few tools that help catch these issues:
- goleak can be wired into TestMain. It checks that no goroutines are still running when your tests finish. It’s useful for catching the “forgot to cancel” class of bugs, which is the most common way unstructured goroutines cause trouble.
- The race detector (go test -race) catches data races between goroutines. It won’t catch leaks, but unstructured goroutines with unclear lifetimes are more likely to race because their access to shared state is harder to reason about.
- testing/synctest (Go 1.25) lets you test time-dependent concurrent code on a fake clock, eliminating the time.Sleep calls that make tests slow and flaky.
- The goroutine leak profiler in runtime/pprof. It uses the garbage collector’s reachability analysis to find goroutines permanently blocked on synchronization primitives that no runnable goroutine can reach. Unlike goleak, which only works in tests, this profile can be collected from a running program via /debug/pprof/goroutineleak, making it useful for finding leaks in production.
If you’re coming from languages like Python or Kotlin, Go’s concurrency can feel overly
verbose, and it is. Wiring up errgroup, checking ctx.Done() in every goroutine, guarding
shared state with a mutex around a WaitGroup; that’s a lot of ceremony for something the
other languages hand you for free.
But as covered earlier, the concurrency paradigms are fundamentally different. Python and Kotlin’s cooperative runtimes can own the cancellation because they own the scheduling. Go’s preemptive scheduler doesn’t know what your goroutine is doing or when it should stop. That’s why cancellation is your job.
The same structured patterns are all achievable in Go. You just build them yourself out of
errgroup, WaitGroup, context, and channels. That gives you more control over goroutine
lifetimes, but it also means more surface area for bugs. Forget a ctx.Done() check and a
goroutine leaks. Misuse a WaitGroup and you deadlock. The study on real-world Go bugs
found 171 concurrency bugs across projects like Docker and Kubernetes, with more than half
caused by Go-specific issues around message passing and goroutine management.
There are frameworks that generate those kind of fakes, and one of them is called GoMock… they’re fine, but I find that on balance, the handwritten fakes tend to be easier to reason about and clearer to sort of see what’s going on. But I’m not an enterprise Go programmer. Maybe people do need that, so I don’t know, but that’s my advice.
– Andrew Gerrand, Testing Techniques (46:44)
No shade against mocking libraries like gomock or mockery. I use them all the time, both at work and outside. But one thing I’ve noticed is that generating mocks often leads to poorly designed tests and increases onboarding time for a codebase.
Also, since almost no one writes tests by hand anymore and instead generates them with LLMs, the situation gets more dire. These ghosts often pull in all kinds of third-party libraries to mock your code, simply because they were trained on a lot of hastily written examples on the web.
So the idea of this post isn’t to discourage using mocking libraries. Rather, it’s to show that even if your codebase already has a mocking library in the dependency chain, not all of your tests need to depend on it. Below are a few cases where I tend not to use any mocking library and instead leverage the constructs that Go gives us.
This does require some extra song and dance with the language, but in return, we gain more control over our tests and reduce the chance of encountering spooky action at a distance.
Say you have a function that creates a database handle:
func OpenDB(user, pass, host, dbName string) (*sql.DB, error) {
dsn := fmt.Sprintf("%s:%s@tcp(%s)/%s", user, pass, host, dbName)
return sql.Open("mysql", dsn)
}
The problem is that sql.Open hands the DSN directly to the driver. When you call
OpenDB("admin", "secret", "db.internal", "orders"), the function formats the DSN string
and hands it to the MySQL driver. You can’t intercept that call, you can’t control what it
returns, and you probably don’t want unit tests leaning on a real driver (or a real MySQL
instance) just to verify DSN formatting.
The fix is to make the database opener injectable:
type SQLOpenFunc func(driver, dsn string) (*sql.DB, error) // (1)
func OpenDB(
user, pass, host, dbName string, openFn SQLOpenFunc, // (2)
) (*sql.DB, error) {
dsn := fmt.Sprintf("%s:%s@tcp(%s)/%s", user, pass, host, dbName)
return openFn("mysql", dsn) // (3)
}
Here:
(1) SQLOpenFunc matches sql.Open’s signature
(2) the opener is injected as a parameter
(3) the function calls openFn instead of calling sql.Open directly
In production, pass the real sql.Open:
func main() {
db, err := OpenDB(
"admin", "secret", "db.internal", "orders", sql.Open, // (1)
)
// ...
}
Here:
(1) sql.Open is passed as the last argument - no wrapper needed
In tests, pass a fake that captures what was passed or returns canned values:
func TestOpenDB(t *testing.T) {
var got string
fakeOpen := func(driver, dsn string) (*sql.DB, error) {
got = dsn // (1) capture what was passed
return nil, nil
}
OpenDB(
"admin", "secret", "db.internal", "orders", fakeOpen, // (2)
)
want := "admin:secret@tcp(db.internal)/orders"
if got != want {
t.Errorf("got %q, want %q", got, want)
}
}
Here:
(1) the fake records the DSN it was handed
(2) the fake is injected in place of sql.Open
This pattern works for any function dependency - UUID generators, random number sources, file openers. Functions are first-class values in Go, so you can pass them around like any other value.
The downside is that parameter lists can grow quickly. If OpenDB also needed a logger, a
metrics client, and a config loader, the signature becomes unwieldy. When you find yourself
passing more than two or three function dependencies, consider grouping them into a struct
with an interface - see Mocking a method on a type.
Sometimes you inherit code where refactoring the function signature isn’t practical. Maybe it’s called from dozens of places, or it’s part of a public API you can’t change:
func PublishOrderCreated(
ctx context.Context, brokers []string, id string) error {
w := &kafka.Writer{
Addr: kafka.TCP(brokers...), Topic: "order-events",
}
defer w.Close()
return w.WriteMessages(ctx, kafka.Message{Key: []byte(id)})
}
The Kafka writer is instantiated directly inside the function. There’s no seam to inject a fake without touching every call site. If this function is called from 50 places in your codebase, changing its signature means updating all 50.
One workaround is a package-level variable that points to the constructor:
type kafkaWriter interface { // (1)
WriteMessages(context.Context, ...kafka.Message) error
Close() error
}
var newWriter = func(brokers []string) kafkaWriter { // (2)
return &kafka.Writer{
Addr: kafka.TCP(brokers...), Topic: "order-events",
}
}
func PublishOrderCreated(
ctx context.Context, brokers []string, id string,
) error {
w := newWriter(brokers) // (3)
defer w.Close()
return w.WriteMessages(ctx, kafka.Message{Key: []byte(id)})
}
Here:
(1) a small interface covering only what we use from kafka.Writer
(2) a package-level variable holds the constructor
(3) the function calls newWriter instead of constructing a kafka.Writer
Production code doesn’t change - it calls PublishOrderCreated exactly as before, and the
default newWriter creates real Kafka writers.
Tests swap it out:
type fakeWriter struct {
key []byte
}
func (f *fakeWriter) WriteMessages(
_ context.Context, msgs ...kafka.Message) error {
if len(msgs) > 0 {
f.key = msgs[0].Key // (1)
}
return nil
}
func (f *fakeWriter) Close() error { return nil }
func TestPublishOrderCreated(t *testing.T) {
orig := newWriter
t.Cleanup(func() { newWriter = orig }) // (2) restore after test
fake := &fakeWriter{}
newWriter = func([]string) kafkaWriter { // (3)
return fake
}
PublishOrderCreated(
t.Context(), []string{"kafka:9092"}, "ord-1",
)
if got := string(fake.key); got != "ord-1" { // (4)
t.Errorf("got %q, want %q", got, "ord-1")
}
}
Here:
(1) the fake records the message key it was given
(2) t.Cleanup ensures the original is restored even if the test fails
(3) the replacement returns the fake as a kafkaWriter, matching the variable’s type
(4) the test asserts on what the fake captured
This works, but be aware of the costs. Tests that mutate package state can’t run in
parallel - they’d stomp on each other’s fakes. If you’re writing tests from an external
package (package events_test), the variable must be exported, which pollutes your public
API.
Prefer the function parameter pattern or the interface pattern over monkey patching. Reserve this technique for legacy code where changing signatures would be too disruptive.
This is a pattern you’ll see all the time in services that integrate with third-party APIs.
Here’s a payment service that charges customers through Stripe (this uses the newer
stripe.Client API, which is the recommended shape in recent stripe-go versions):
func (s *Service) ChargeCustomer(
ctx context.Context, custID string, cents int64) (string, error) {
intent, err := s.client.V1PaymentIntents.Create(ctx,
&stripe.PaymentIntentCreateParams{
Amount: stripe.Int64(cents),
Currency: stripe.String("usd"),
Customer: stripe.String(custID),
},
)
if err != nil {
return "", err
}
return intent.ID, nil
}
Testing this hits the real Stripe API. That’s slow, requires live credentials, and in
production mode charges actual money. The problem is that s.client is a *stripe.Client
from the SDK - there’s no way to swap it for a fake without introducing a seam.
The solution is to introduce an interface that describes what you need:
type PaymentIntentCreator interface { // (1)
Create(
context.Context, *stripe.PaymentIntentCreateParams,
) (*stripe.PaymentIntent, error)
}
type Service struct {
intents PaymentIntentCreator // (2)
}
func (s *Service) ChargeCustomer(
ctx context.Context, custID string, cents int64) (string, error) {
intent, err := s.intents.Create(ctx, // (3)
&stripe.PaymentIntentCreateParams{
Amount: stripe.Int64(cents),
Currency: stripe.String("usd"),
Customer: stripe.String(custID),
})
if err != nil {
return "", err
}
return intent.ID, nil
}
Here:
(1) PaymentIntentCreator describes the single operation the service needs
(2) the service depends on the interface, not on a *stripe.Client
(3) ChargeCustomer calls through the interface
In production, inject the real Stripe service client:
func main() {
sc := stripe.NewClient("sk_test_...")
svc := &Service{intents: sc.V1PaymentIntents} // (1)
// ...
}
Here:
(1) sc.V1PaymentIntents satisfies PaymentIntentCreator (it has a Create method with the right signature)
In tests, you pass a fake that returns canned values:
type fakeIntents struct {
id string // (1)
}
func (f *fakeIntents) Create(
context.Context, *stripe.PaymentIntentCreateParams,
) (*stripe.PaymentIntent, error) {
return &stripe.PaymentIntent{ID: f.id}, nil // (2)
}
func TestChargeCustomer(t *testing.T) {
fake := &fakeIntents{id: "pi_123"} // (3)
svc := &Service{intents: fake}
id, _ := svc.ChargeCustomer(t.Context(), "cus_abc", 5000)
// assert id == "pi_123"
}
Here:
(1) the fake holds the ID it should hand back
(2) Create returns a canned *stripe.PaymentIntent without touching the network
(3) the test injects the fake into the service
The service doesn’t know or care whether it’s talking to Stripe or a test fake. This is the most common mocking pattern in Go - define an interface for your dependency, accept it in your constructor, and swap implementations at runtime.
But what happens when the SDK surface area is huge and your code only needs one operation? That’s where the next pattern comes in.
The previous pattern works well when you control the interface. But AWS SDK clients have
dozens of methods. The DynamoDB client has over 40 operations - GetItem, PutItem,
Query, Scan, BatchGetItem, and so on. If you write tests against a dependency that
exposes the whole surface area, your fakes become annoying fast.
The solution is to define a minimal interface on the consumer side:
type itemGetter interface { // (1)
GetItem(context.Context, *dynamodb.GetItemInput,
...func(*dynamodb.Options)) (*dynamodb.GetItemOutput, error)
}
func GetUserByID(
ctx context.Context, client itemGetter, id string) (*User, error) {
out, err := client.GetItem(ctx, &dynamodb.GetItemInput{ // (2)
TableName: aws.String("users"),
Key: map[string]types.AttributeValue{
"pk": &types.AttributeValueMemberS{Value: id},
},
})
// ...
}
Here:
(1) itemGetter is a one-method interface defined on the consumer side
(2) GetUserByID only calls GetItem, so that’s all the interface requires
In production, pass the real DynamoDB client - it satisfies itemGetter because it has a
GetItem method. Go interfaces are satisfied implicitly:
func main() {
client := dynamodb.NewFromConfig(cfg)
user, err := GetUserByID(ctx, client, "user-123") // (1)
// ...
}
Here:
(1) the real client satisfies itemGetter automatically - no adapter or wrapper needed thanks to implicit interface satisfaction
In tests, you only implement the one method you need:
type fakeItemGetter struct {
item map[string]types.AttributeValue // (1)
}
func (f *fakeItemGetter) GetItem(context.Context, *dynamodb.GetItemInput,
...func(*dynamodb.Options)) (*dynamodb.GetItemOutput, error) {
return &dynamodb.GetItemOutput{Item: f.item}, nil // (2)
}
func TestGetUserByID(t *testing.T) {
fake := &fakeItemGetter{
item: map[string]types.AttributeValue{
"email": &types.AttributeValueMemberS{Value: "[email protected]"},
},
}
user, _ := GetUserByID(t.Context(), fake, "u-1") // (3)
// assert user.Email == "[email protected]"
}
Here:
(1) the fake carries the item it should return
(2) GetItem hands back that canned item
(3) the fake is passed where the real client would go
This is the Interface Segregation Principle in action - clients shouldn’t be forced to depend on methods they don’t use.
But this approach has limits. If you have 20 functions each using different DynamoDB operations, you’d end up with 20 tiny interfaces. And sometimes you’re stuck with a preexisting interface type that has more methods than you want. That’s where struct embedding helps.
Sometimes you can’t define your own minimal interface. Maybe a library insists on a specific interface type, and it’s bigger than what your test cares about.
The AWS SDK v2’s S3 upload manager is a good example. manager.NewUploader takes a client
interface that supports both single-part uploads and multipart uploads. If your test is
exercising the single-part path and you only want to intercept PutObject, implementing the
multipart methods just to satisfy the interface is pure busywork.
Go’s struct embedding provides an escape hatch. Here’s the production code:
func UploadReport(
ctx context.Context, client manager.UploadAPIClient, // (1)
bucket, key string, body io.Reader,
) error {
up := manager.NewUploader(client)
_, err := up.Upload(ctx, &s3.PutObjectInput{
Bucket: aws.String(bucket),
Key: aws.String(key),
Body: body,
})
return err
}
Here:
(1) the function accepts manager’s UploadAPIClient interface - a large interface with many methods
In tests, embed the interface in your fake and override only what you need:
type fakeS3 struct {
manager.UploadAPIClient // (1)
gotKey string
gotBody []byte
}
func (f *fakeS3) PutObject(
_ context.Context, in *s3.PutObjectInput, _ ...func(*s3.Options),
) (*s3.PutObjectOutput, error) {
if in.Key != nil {
f.gotKey = *in.Key // (2)
}
if in.Body != nil {
f.gotBody, _ = io.ReadAll(in.Body)
}
return &s3.PutObjectOutput{}, nil
}
func TestUploadReport(t *testing.T) {
fake := &fakeS3{}
err := UploadReport(
t.Context(),
fake, // (3)
"my-bucket",
"reports/q1.csv",
bytes.NewReader([]byte("hi")), // (4)
)
if err != nil {
t.Fatal(err)
}
if fake.gotKey != "reports/q1.csv" {
t.Errorf("got %q, want %q", fake.gotKey, "reports/q1.csv")
}
}
Here:
(1) embedding the UploadAPIClient interface lets fakeS3 satisfy it without implementing every method
(2) the PutObject override captures the key and body
(3) the fake goes where the real client would
(4) a small in-memory body keeps the upload on the single-part PutObject path
The embedded interface value is nil, so any method you don’t override will panic if called. This is a feature, not a bug. If your code accidentally triggers multipart and calls CreateMultipartUpload, the test crashes immediately, and you learn that your test setup (or your assumptions) is wrong.
For interfaces with a single method, there’s an even more compact approach. Say you have middleware that validates authentication tokens:
type ctxKey string
const userIDKey ctxKey = "userID"
type TokenValidator interface { // (1)
Validate(token string) (userID string, err error)
}
func RequireAuth(v TokenValidator, next http.Handler) http.Handler { // (2)
fn := func(w http.ResponseWriter, r *http.Request) {
userID, err := v.Validate(r.Header.Get("Authorization"))
if err != nil {
http.Error(w, "unauthorized", 401)
return
}
ctx := context.WithValue(r.Context(), userIDKey, userID)
next.ServeHTTP(w, r.WithContext(ctx))
}
return http.HandlerFunc(fn)
}
Here:
(1) TokenValidator is a single-method interface for validating tokens
(2) the middleware depends only on that interface
You could write a fake struct with a Validate method, but Go lets you define a function
type that satisfies the interface:
type TokenValidatorFunc func(string) (string, error) // (1)
func (f TokenValidatorFunc) Validate(token string) (string, error) {
return f(token) // (2)
}
Here:
(1) a function type whose signature matches the interface’s single method
(2) Validate simply calls the function value itself
This is the same pattern the standard library uses with http.HandlerFunc. Now tests can
pass inline functions:
func TestRequireAuth(t *testing.T) {
v := TokenValidatorFunc(func(token string) (string, error) {
if token == "Bearer valid" {
return "user-123", nil // (1)
}
return "", errors.New("invalid")
})
next := http.HandlerFunc(func(http.ResponseWriter, *http.Request) {})
handler := RequireAuth(v, next) // (2)
req := httptest.NewRequest("GET", "/protected", nil)
req.Header.Set("Authorization", "Bearer valid")
rec := httptest.NewRecorder()
handler.ServeHTTP(rec, req)
// assert rec.Code == http.StatusOK
}
Here:
(1) the inline function returns a canned user ID for the valid token
(2) the function value, converted to TokenValidatorFunc, satisfies the TokenValidator interface
No extra struct definitions cluttering up your test file.
When your code makes HTTP requests to external services, the net/http/httptest package
provides a test server that runs on localhost. Say you have a client that fetches exchange
rates:
func (c *Client) GetRate(from, to string) (float64, error) {
url := c.baseURL + "/latest?base=" + from + "&symbols=" + to
resp, err := c.httpClient.Get(url)
if err != nil {
return 0, err
}
defer resp.Body.Close()
// decode JSON, return rate...
}
In production, c.baseURL points to the real API. Testing against it is problematic - it’s
slow, requires credentials, returns different values each time, and might rate-limit your
CI.
The httptest.Server spins up a real HTTP server on localhost:
func TestGetRate(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc( // (1)
func(w http.ResponseWriter, r *http.Request) {
fmt.Fprint(w, `{"base":"USD","rates":{"EUR":0.92}}`)
},
))
defer srv.Close() // (2)
client := NewClient(srv.URL, "key") // (3)
rate, _ := client.GetRate("USD", "EUR")
// assert rate == 0.92
}
Here:
(1) httptest.NewServer starts a real HTTP server on a random localhost port, backed by your handler
(2) defer srv.Close() shuts it down when the test ends
(3) the client is pointed at srv.URL instead of the real API
Your code makes real HTTP calls over TCP, but they never leave the machine. You can return different responses for different scenarios - rate limits, malformed JSON, network errors - whatever you need to test.
This is essentially the same technique as Mocking a function - we’re just applying it to
time.Now. Code that depends on the current time is tricky to test:
func IsExpired(expiresAt time.Time) bool {
return time.Now().After(expiresAt)
}
Every call to time.Now() returns a different value. You can’t write a reliable test
because the result depends on when the test runs.
Make the clock injectable:
type Clock func() time.Time // (1)
func IsExpired(expiresAt time.Time, clock Clock) bool { // (2)
return clock().After(expiresAt)
}
Here:
(1) Clock is a function type matching time.Now’s signature
(2) the clock is injected as a parameter
In production, pass time.Now:
func main() {
expired := IsExpired(token.ExpiresAt, time.Now) // (1)
// ...
}
Here:
(1) pass the time.Now function itself - it satisfies the Clock type
In tests, pass a function that returns a fixed time:
func TestIsExpired(t *testing.T) {
expiry := time.Date(2025, 6, 15, 12, 0, 0, 0, time.UTC)
before := func() time.Time { return expiry.Add(-time.Hour) } // (1)
after := func() time.Time { return expiry.Add(time.Hour) } // (2)
if IsExpired(expiry, before) {
t.Error("should not be expired")
}
if !IsExpired(expiry, after) {
t.Error("should be expired")
}
}
Here:
1. a clock frozen one hour before the expiry
2. a clock frozen one hour after the expiry
For code that uses time.Sleep, timers, or tickers, Go 1.25’s testing/synctest provides a
fake clock that advances automatically when goroutines in the bubble are durably blocked:
func TestPeriodicFlush(t *testing.T) {
synctest.Test(t, func(t *testing.T) { // (1)
count := 0
go func() {
ticker := time.NewTicker(10 * time.Second) // (2)
defer ticker.Stop()
for range ticker.C {
count++
if count >= 3 {
return
}
}
}()
time.Sleep(35 * time.Second) // (3)
synctest.Wait() // (4)
// assert count == 3
})
}
Here:
1. synctest.Test runs the function in an isolated bubble with fake time starting at 2000-01-01
2. the ticker inside the bubble runs on the fake clock
3. time.Sleep inside the bubble uses fake time; time advances when goroutines are durably blocked, so this returns instantly after the ticker fires 3 times
4. synctest.Wait is a synchronization point; it blocks until the other goroutines in the bubble are durably blocked or finished

Inside synctest.Test, the framework intercepts time operations. The test completes instantly rather than waiting for real time to pass.
These are the most common ones where I typically avoid opting for mocking libraries. But there are cases when I still like to generate mocks for an interface. One example that comes to mind is testing gRPC servers. I’m sure I’m forgetting some other cases where I regularly use mocking libraries.
The point is not to discourage the use of mocking libraries or to make a general statement that “all mocking libraries are bad.” It’s that these mocking libraries have costs associated with them. Code generation is fun, but it’s one extra step that you have to teach someone who’s onboarding to your codebase.
Also, if you’re using LLMs to generate tests, you may want to write some tests manually to give the tool a sense of how you want your tests written, so it doesn’t pull in the universe just to mock something that can be mocked natively using Go constructs.
For more on why handwritten fakes often beat generated mocks, see Test state, not interactions.
A common question that shows up during a migration is, “How do we make sure the new system behaves exactly like the old one, minus the icky parts?” Another one is, “How do we build the new system while the old one keeps changing without disrupting the business?”
There’s no universal playbook. It depends on how gnarly the old system is, how ambitious the new system is, and how much risk the business can stomach. After going through a few of these migrations, I realized one approach keeps showing up. So I’ll expand on it here.
The idea is that you shadow a slice of production traffic to the new system. The old system keeps serving real users. A copy of that same traffic is forwarded to the new system along with the old system’s response. The new system runs the same business logic and compares its outputs with the old one. The entire point is to make the new system return the exact same answer the old one would have, for the same inputs and the same state.
At the start, you don’t rip out bad behavior or ship new features. Everything is about output parity. Once the systems line up and the new one has processed enough real traffic to earn some trust, you start sending actual user traffic to it. If something blows up, you roll back. If it behaves as expected, you push more traffic. Eventually the old system gets to ride off into the sunset.
This workflow is typically known as shadow testing or tap and compare testing.
Say we have a Python service with a handful of read and write endpoints the business depends on. It’s been around for a while, and different teams have patched it over the years. Some of the logic does what it does for reasons nobody remembers anymore. It still works, but it’s getting harder to maintain. Also, the business wants a tighter SLO. So the team decides to rewrite it in Go.
To keep the scope tight, I’m only talking about HTTP read and write endpoints on the main request path. The same applies to gRPC, minus the transport details. I’m ignoring everything else: message queues, background workers, async job processing, analytics pipelines, and other side channels that also need migrating.
During shadow testing, the Python service stays on the main request path. All real user traffic still goes to the Python service. A proxy or load balancer sitting in front of it forwards requests as usual, gets an answer back, and returns that answer to the user.
That same proxy also emits tap events. Each tap event contains a copy of the request and the canonical response the Python service sent to the user. Those tap events go to the Go service on a shadow path. From the outside world, nothing has changed. Clients talk to Python, and Python talks to the live production database.
The Go service never serves real users during this phase. It only sees tap events. For each event, it reconstructs the request, runs its version of the logic against a separate datastore, and compares its outputs with the Python response recorded in the event. The Python response is always the source of truth.
The Go service has its own datastore, usually a snapshot or replica of production that’s been detached so it can be written freely. This is the sister datastore. The Go service only talks to it for reads and writes. It never touches the real production DB. The sister datastore is close enough to show real-world behavior but isolated enough that nothing breaks.
With this setup in place, you spend time fixing differences. If the Python service returns a specific payload shape or some quirky value, the Go service has to match it. If Python gets a bug fix or a new feature, you update Go. You keep doing this until shadow traffic stops producing mismatches. Then you start thinking about cutover.
Reads don’t change anything in the database, so they are easier to start with.
On the main path, a user sends a request. The proxy forwards it to the Python service as usual. The Python service reads from the real database, builds a response, and returns it to the caller.
While that is happening, the proxy also constructs a tap event. At minimum, this event contains a copy of the original request (method, URL, headers, body) and the canonical response (status, headers, body) the Python service returned to the user.
The proxy sends this tap event to the Go service via an internal HTTP or RPC endpoint. Alternatively, it can publish the event to a Kafka stream, where a consumer eventually forwards it to the internal tap endpoint.
The important thing is that the tap event captures the exact input and output of the Python service as seen by the real user.
A typical read path during tap compare: the user talks to the proxy, the proxy forwards the request to the Python service, Python reads the production database and answers the user, and a tap event carrying that request and response flows to the Go service, which reads its sister datastore.
From the Go service’s point of view, a tap event is just structured data. A simple shape might look like this on the wire:
{
"request": {
"method": "GET",
"url": "/users/123?verbose=true",
"headers": { "...": ["..."] },
"body": "..."
},
"python_response": {
"status": 200,
"headers": { "...": ["..."] },
"body": "{ \"id\": \"123\", \"name\": \"Alice\" }"
}
}
The Go side reconstructs the request, runs its own logic against the sister datastore, and
compares its answer with python_response. No extra call back into Python. No race between
a second read and the response that already went to the user.
On the Go side, a handler for a read tap event might look like this:
type TapRequest struct {
Method string `json:"method"`
URL string `json:"url"`
Headers map[string][]string `json:"headers"`
Body []byte `json:"body"`
}
type TapResponse struct {
Status int `json:"status"`
Headers map[string][]string `json:"headers"`
Body []byte `json:"body"`
}
type TapEvent struct {
Request TapRequest `json:"request"`
PythonResponse TapResponse `json:"python_response"`
}
func TapHandleGetUser(w http.ResponseWriter, r *http.Request) {
// This endpoint is internal only.
// It receives tap events from the proxy, not real user traffic.
var tap TapEvent
if err := json.NewDecoder(r.Body).Decode(&tap); err != nil {
http.Error(w, "bad tap payload", http.StatusBadRequest)
return
}
// Rebuild something close to the original HTTP request.
reqURL, err := url.Parse(tap.Request.URL)
if err != nil {
http.Error(w, "bad url", http.StatusBadRequest)
return
}
// Body is a one-shot stream, so buffer it for reuse.
bodyBytes := append([]byte(nil), tap.Request.Body...)
goReq := &http.Request{
Method: tap.Request.Method,
URL: reqURL,
Header: http.Header(tap.Request.Headers),
Body: io.NopCloser(bytes.NewReader(bodyBytes)),
}
// Go service: run candidate logic against sister datastore.
goResp, goErr := goUserService.GetUser(r.Context(), goReq)
if goErr != nil {
log.Printf("go candidate error: %v", goErr)
}
// Normalize and compare off the main response path.
// The real user already got python_response.
go func() {
normalizedPython := normalizeHTTP(tap.PythonResponse)
normalizedGo := normalizeHTTP(goResp)
if !deepEqual(normalizedPython, normalizedGo) {
log.Printf(
"read mismatch: url=%s python=%v go=%v",
tap.Request.URL,
normalizedPython,
normalizedGo,
)
}
}()
// Optional debugging response for whoever is calling the tap
// endpoint.
w.WriteHeader(http.StatusNoContent)
}
A few things to notice: the endpoint is internal-only and never sees real user traffic, the comparison runs asynchronously off the response path, the Python response recorded in the tap event is always the source of truth, and the handler answers 204 because nobody depends on its output.
When the read diffs drop to zero (or near zero) against live traffic, you can trust the Go implementation matches the Python one.
Write endpoints change state, so they are harder to migrate.
On the main path, only the Python service is allowed to mutate production state.
A typical write on the main path: the user sends the request, the proxy forwards it to the Python service, Python validates it and mutates the production database, and the response goes back to the user.
That path is the only one touching production. The Go service must not write to the live production database or trigger real external side effects; its writes go exclusively to the sister datastore.
For writes, the tap event pushed by the proxy looks quite similar to reads:
{
"request": {
"method": "POST",
"url": "/users",
"headers": { "...": ["..."] },
"body": "{ \"email\": \"[email protected]\", \"name\": \"Alice\" }"
},
"python_response": {
"status": 201,
"headers": { "...": ["..."] },
"body": "{ \"id\": \"123\", \"email\": \"[email protected]\" }"
}
}
The write path during tap compare mirrors the read path: Python handles the real mutation and answers the user, while the tap event drives a shadow write against the sister datastore.
On the Go side, the write tap handler follows the same pattern as reads but has more corner cases to think through.
A shadow write handler might look like this:
type UserInput struct {
Email string `json:"email"`
Name string `json:"name"`
// ... other fields
}
type User struct {
ID string `json:"id"`
Email string `json:"email"`
Name string `json:"name"`
CreatedAt time.Time `json:"created_at"`
// ... other fields
}
func TapHandleCreateUser(w http.ResponseWriter, r *http.Request) {
// Internal only. Receives tap events for CreateUser.
var tap TapEvent
if err := json.NewDecoder(r.Body).Decode(&tap); err != nil {
http.Error(w, "bad tap payload", http.StatusBadRequest)
return
}
// Decode the original request body once.
var input UserInput
if err := json.Unmarshal(tap.Request.Body, &input); err != nil {
log.Printf("bad original json: %v", err)
return
}
// The Python response is canonical: this is what the real user saw.
pyUser, err := decodePythonUser(tap.PythonResponse)
if err != nil {
log.Printf("bad python response: %v", err)
return
}
// Run the Go write path against the sister datastore.
// This must never talk to the live production DB.
goUser, goErr := goUserService.CreateUserInSisterStore(
r.Context(), input,
)
if goErr != nil {
log.Printf("go candidate write error: %v", goErr)
}
// Compare results asynchronously.
go func() {
normalizedPython := normalizeUser(pyUser)
normalizedGo := normalizeUser(goUser)
if !compareUsers(normalizedPython, normalizedGo) {
log.Printf(
"write mismatch: email=%s python=%v go=%v",
normalizedPython.Email,
normalizedPython,
normalizedGo,
)
}
}()
w.WriteHeader(http.StatusNoContent)
}
You are comparing how each system transforms the same request into a domain object and response. You are not trying to drive the Python service a second time. You are not trying to rebuild the Python result from scratch against changed state.
But with this setup, the write path has several corner cases to think through.
Uniqueness checks, conditional updates, and other validations that depend on database state are sensitive to timing. The Python write runs against the actual production state at the moment the main request hits. The Go write runs against whatever state exists in the sister datastore when the tap event arrives.
If the sister datastore is a snapshot that is not continuously replicated, it will drift almost immediately. Even with streaming replication, there may be short lags. That means the state the Go service writes against can differ from the state Python saw, so uniqueness checks and conditional updates may legitimately disagree for the same request.
You should expect some write comparisons to be noisy because of state drift and treat those separately from genuine logic differences, rather than treating every diff as a bug.
The important thing is: when you see a mismatch, you can decide whether it is a real logic difference or just the sister store living in a slightly different universe for that request.
Real systems don’t get one clean write per user action. You get retries, duplicates, and concurrent updates.
On the main path, you might have a client retrying a timed-out request, a duplicate submission from a double click, or two concurrent updates racing on the same record.
Your Python service probably already has a story for this, such as idempotency keys, version checks, or last-write-wins semantics. The tap path needs to reflect what actually happened, not an idealized story.
Because the tap event is constructed from the real request and real response at the proxy, it naturally honors whatever the Python service did. If a retry was coalesced into a single write under an idempotency key, you will see a single successful response in the tap stream. If the second retry was rejected as a conflict, you will see that error. The Go service just needs to implement the same semantics against the sister datastore.
What still bites you is ordering. Tap events may arrive at the Go service a little out of order relative to how mutations hit production. If two writes race, Python might process them in order A, B while the tap messages arrive as B, A. The sister datastore will then experience a different sequence of state changes than production did, which can yield legitimate differences in final state.
You can’t fully eliminate this. What you can do is implement the same idempotency and conflict-resolution semantics on both sides (so replayed calls to CreateUser behave the same), and focus comparisons on per-request behavior rather than on multi-request history until you are comfortable with the noise.
Writes often have external side effects: emails, payment gateways, cache invalidations, search indexing, analytics.
The tap path isolates database writes by using the sister datastore, but that is not enough on its own. You have to run the Go service in a mode where those side-effectful calls are either disabled or mocked.
The usual pattern is to run the Go service in a shadow mode where the side-effectful clients are replaced with no-op or recording fakes.
You want the code paths that decide “should we send a welcome email” or “should we charge this card” to run, because they influence the domain model and response shape. You don’t want the actual email to go out or the real payment provider to be hit twice.
On the Python side, you don’t need dry runs or special write endpoints. The real main path already did the work, and the tap event gives you the results. The only thing the Python service might need for tap compare is a dedicated read endpoint that returns a normalized view of state if you want to sample post-write state directly. That read endpoint must not cause extra writes or side effects.
Tap compare tells you whether the new system produces the same responses and state changes as the old one, for the real traffic and real state you actually observed.
It doesn’t guarantee correctness for inputs you never saw, behavior under full production load, or parity once the two datastores have drifted apart.
The right way to think about it is: tap compare lets you align the new system with the old one for the traffic you actually have, under the state and timing conditions you actually experienced. It shrinks the unknowns before you put the new system in front of real users.
The Tap* handlers are test-only. They will never be promoted to production. They exist to
validate the domain logic, not to serve users. The 204 No Content response makes this
clear.
Here’s how the pieces fit together:
The domain logic lives in methods on goUserService that take a context and input and return a response. This is the code you’re actually testing.
Both tap and production handlers call the same domain logic. The difference is what happens to the result. Tap handlers compare and throw away. Production handlers serialize and return.
A production handler might look like this:
func HandleGetUser(w http.ResponseWriter, r *http.Request) {
resp, err := goUserService.GetUser(r.Context(), r)
if err != nil {
writeError(w, err)
return
}
writeHTTP(w, resp)
}
During tap compare, TapHandleGetUser feeds the same inputs into goUserService.GetUser
and compares resp against the Python response. Meanwhile, HandleGetUser exists but isn’t
on the main path yet. It might serve staging traffic or a canary behind a flag.
Once the diffs drop to zero, you have evidence goUserService.GetUser works correctly. At
that point, you route real traffic to HandleGetUser. The domain logic has already been
validated. The production handler just wires it to HTTP.
Once the production handlers have started to serve real traffic, you can remove the tap tests. The Tap* prefix makes them easy to find, and what remains are the plain production handlers like HandleGetUser and HandleCreateUser.
A few things worth calling out beyond what the write section already covers: generated values like timestamps, IDs, and numeric formatting will differ between the two implementations, and a difference like 10.0 vs 10 doesn’t matter. Normalize or ignore these fields.
Typically, you don’t have to build all the plumbing by hand. Proxies like Envoy, NGINX, and HAProxy, or a service mesh like Istio, can help you mirror traffic, capture tap events, and feed them into a shadow service. I left out tool-specific workflows so that the core concept doesn’t get obscured.
Tap compare doesn’t remove all the risk from a migration, but it moves a lot of it into a
place you can see: mismatched payloads, noisy writes, and gaps in business logic. Once those
are understood, switching over to the new service is less of a big bang and more of a boring
configuration change, followed by trimming a pile of Tap* code you no longer need.
A man with a watch knows what time it is. A man with two watches is never sure.
Take this example:
func validate(input string) (bool, error) {
// Validation check 1
if input == "" {
return false, nil
}
// Validation check 2
if isCorrupted(input) {
return false, nil
}
// System check
if err := checkUpstream(); err != nil {
return false, err
}
return true, nil
}
This function returns two signals: a boolean to indicate if the string is valid, and an error to explain any problem the function might run into.
The issue is that these two signals are independent. Put together, they produce four possible combinations:
- true, nil: The input is valid and the function encountered no issues. This is the only obvious mode.
- false, nil: Implies the function didn’t hit a system error but the input was invalid. However, in many codebases, this combination is accidentally used to hide real errors that were swallowed.
- true, err: A contradiction. The function claims success and failure at the same time.
- false, err: Looks like a clean failure, but it creates a priority trap. The Go convention dictates you must check the error first. If a caller checks the boolean first, they might see false and treat a major system crash as a simple validation failure.

In this specific case, we never return true, err, but the caller doesn’t know that. They have to read the code to understand which subset of the possible combinations the function actually uses.
For lack of a better term, I call this splintered failure modes. It is one of the cases that the adage make illegal state unrepresentable aims to prevent.
In our case, validate encodes the success/failure state in two places. These two signals
can disagree. The boolean tries to express validity, and the error tries to express system
failure, yet both attempt to answer the same question: did this succeed?
When combinations like false, nil or true, err appear, the caller needs to know how to
reconcile the conflicting states.
We fix the ambiguity by removing the boolean status flag entirely.
In this refactored version, the error assumes total responsibility for the function’s
state (success vs. failure). The first return value becomes purely the payload.
The caller checks one place and one place only: the error.
// We return the data (string), not a flag (bool)
func validate(input string) (string, error) {
if input == "" {
return "", fmt.Errorf("input cannot be empty")
}
if isCorrupted(input) {
return "", fmt.Errorf("input is corrupted")
}
if err := checkUpstream(); err != nil {
return "", err
}
// If we are here, the data is valid
return input, nil
}
This makes the call site trivial because the state is no longer split. If the error is non-nil, the operation failed. If it is nil, the operation succeeded.
Sometimes the caller of a function needs to take different actions depending on the type of an error. In that case, just knowing whether a function succeeded or failed isn’t enough.
Removing the boolean removes the ambiguity, but it introduces a new question: How do we distinguish between “validation error” and “system failure”?
Previously, the boolean represented validation outcome (valid/invalid), and the error
represented the system failures (crash/upstream). Now that we have consolidated everything
into error, we need a way to differentiate the kind of failure without re-introducing a
second return value.
We can use sentinel errors to encode multiple failure modes into one error variable. The
error return value remains the single source of truth for “did it fail?”, but the
content of that error tells us “how it failed.”
var (
// Domain/Logic failures
ErrEmpty = errors.New("input cannot be empty")
ErrCorrupted = errors.New("input is corrupted")
// System/Mechanical failures
ErrSystem = errors.New("system failure")
)
func validate(input string) (string, error) {
if input == "" {
return "", ErrEmpty
}
if isCorrupted(input) {
return "", ErrCorrupted
}
if err := checkUpstream(); err != nil {
// We could return err directly, or wrap it
return "", ErrSystem
}
return input, nil
}
We have unified the failure state (it is always just an error), but we haven’t lost the
granularity. The caller can now use errors.Is to switch between the failure modes:
val, err := validate(userData)
if err != nil {
switch {
case errors.Is(err, ErrEmpty):
// Handle logic failure 1 (e.g. prompt user)
return
case errors.Is(err, ErrCorrupted):
// Handle logic failure 2 (e.g. reject payload)
return
case errors.Is(err, ErrSystem):
// Handle system failure (e.g. alert ops team)
log.Fatal(err)
default:
log.Fatal(err)
}
}
If sentinels aren’t enough (for example, if you need to know which field failed
validation), you can use error types. This allows the single error value to carry
structured metadata while still adhering to the standard error interface.
Here, we map both “Empty” and “Corrupted” to a ValidationError type, while leaving system
errors as standard errors.
type ValidationError struct {
Field string
Reason string
}
func (e *ValidationError) Error() string {
return fmt.Sprintf("invalid %s: %s", e.Field, e.Reason)
}
func validate(input string) (string, error) {
if input == "" {
return "", &ValidationError{Field: "input", Reason: "empty"}
}
if isCorrupted(input) {
return "", &ValidationError{Field: "input", Reason: "corrupted"}
}
if err := checkUpstream(); err != nil {
return "", err
}
return input, nil
}
The caller can then use errors.As to inspect the failure mode in detail:
val, err := validate(userData)
if err != nil {
var vErr *ValidationError
// Check if the error is a logical ValidationError
if errors.As(err, &vErr) {
fmt.Printf("Validation failed on %s: %s", vErr.Field, vErr.Reason)
return
}
// If not, it is a system failure
log.Fatal(err)
}
By sticking to the error value as the single indicator of failure, we eliminate the “two
watches” paradox. Whether the failure is a simple validation error or a catastrophic system
crash, all the failure modes are encapsulated inside the single error value itself.
Run the real command. It invokes the actual binary that creates the subprocess and asserts against the output. However, that makes tests slow and tied to the environment. You have to make sure the same binary exists and behaves the same everywhere, which is harder than it sounds.
Fake it. Mock the subprocess to keep tests fast and isolated. The problem is that the fake version doesn’t behave like a real process. It won’t fail, write to stderr, or exit with a non-zero code. That makes it hard to trust the result, and over time the mock can drift away from what the real command actually does.
Re-exec. I discovered this neat trick while watching Mitchell Hashimoto’s Advanced Testing with Go talk. In fact, it originated in the stdlib os/exec test suite. With re-exec, your test binary spawns a new subprocess that runs itself again. Inside that subprocess, the code emulates the behavior of the real command. The parent process then interacts with this subprocess exactly as it would with a real command. In short:
This setup makes re-exec a middle ground between mocking and invoking the actual subprocess.
The first two paths are well-trodden, so let’s look closer at the third one. Here’s how it works: the test binary re-executes itself with a -test.run flag that selects a single helper test, and an environment variable tells that helper to emulate the target command instead of running the normal test suite.
You still get a real subprocess, but the behavior of your original binary invocation is emulated inside it. So you don’t invoke the original command. Observe:
// /cmd/echo/main.go
package main
import (
"os/exec"
)
// RunEcho executes the system "echo" command with the provided message
// and returns the command's output.
func RunEcho(msg string) (string, error) {
cmd := exec.Command("echo", msg)
out, err := cmd.Output()
return string(out), err
}
RunEcho invokes the system’s echo binary with some argument and returns the output. Now
let’s test it using the re-exec trick:
// /cmd/echo/main_test.go
package main
import (
"fmt"
"os"
"os/exec"
"testing"
)
// TestEchoHelper runs when the binary is re-executed with
// GO_WANT_HELPER_PROCESS=1. It prints its argument and exits,
// emulating "echo".
func TestEchoHelper(t *testing.T) {
if os.Getenv("GO_WANT_HELPER_PROCESS") != "1" {
return
}
fmt.Print(os.Args[len(os.Args)-1])
os.Exit(0)
}
func TestRunEcho(t *testing.T) {
// Spawn the same test binary as a subprocess instead of calling the
// real "echo". This runs only the TestEchoHelper test in a subprocess
// which emulates the behavior of "echo"
cmd := exec.Command(
os.Args[0],
"-test.run=TestEchoHelper",
"--",
"hello",
)
cmd.Env = append(os.Environ(), "GO_WANT_HELPER_PROCESS=1")
out, err := cmd.Output()
if err != nil {
t.Fatal(err)
}
if string(out) != "hello" {
t.Fatalf("got %q, want %q", out, "hello")
}
}
TestRunEcho creates a command that re-runs the same test binary (os.Args[0]) as a
subprocess via the exec.Command. The -test.run=TestEchoHelper flag tells Go’s test
runner to execute only the TestEchoHelper function inside that new process. The "--"
marks the end of the test runner’s own flags, and everything after it ("hello") becomes an
argument available to the helper process in os.Args.
When this subprocess starts, it sees that the environment variable
GO_WANT_HELPER_PROCESS=1 is set. That tells it to behave like a helper instead of running
the full test suite. The TestEchoHelper function then prints its last argument ("hello")
to standard output and exits. In other words, we’re emulating echo inside
TestEchoHelper. This part is intentionally kept simple, but you can do all kinds of things
here to emulate the actual echo command. In real tests, this will also include different
failure modes.
From the parent process’s perspective, it looks just like running /bin/echo hello, except
everything is happening within the Go test binary. The subprocess is real, but its behavior
is entirely controlled by the test.
You might find it strange that the actual RunEcho function isn’t called anywhere. That’s
on purpose. The goal of this example is not to test production logic, but to show how to
emulate and control subprocesses inside a test environment. The production function here
doesn’t contain any logic beyond calling exec.Command, so there’s nothing meaningful to
verify yet.
In real code, typically, you’d split subprocess management into two parts: one that spawns the process and another that handles its output and errors. The handler is where the bulk of your logic should live. This way, the subprocess handling code can be tested in isolation without tying it to a real subprocess.
Consider this example where the production code invokes the git switch mybranch command.
The RunGitSwitch command calls the git binary with the appropriate arguments and passes
the *exec.Cmd pointer to the handleGitSwitch function. This handler function has the
bulk of the logic that interacts with the git subprocess.
// path: /cmd/git/main.go
package main
import (
"os/exec"
)
// handleGitSwitch runs a command and returns its output and error.
func handleGitSwitch(cmd *exec.Cmd) (string, error) {
out, err := cmd.CombinedOutput()
return string(out), err
}
// RunGitSwitch constructs the subprocess to run "git switch".
func RunGitSwitch(branch string) (string, error) {
cmd := exec.Command("git", "switch", branch)
return handleGitSwitch(cmd)
}
And the corresponding test:
// path: /cmd/git/main_test.go
package main
import (
"fmt"
"os"
"os/exec"
"testing"
)
// TestGitSwitchHelper acts as the fake "git switch" subprocess.
func TestGitSwitchHelper(t *testing.T) {
if os.Getenv("GO_WANT_HELPER_PROCESS") != "1" {
return
}
// Emulate "git switch" output.
fmt.Printf("Switched to branch '%s'\n", os.Args[len(os.Args)-1])
os.Exit(0)
}
func TestGitSwitch(t *testing.T) {
cmd := exec.Command(
os.Args[0],
"-test.run=TestGitSwitchHelper", "--", "feature-branch",
)
cmd.Env = append(os.Environ(), "GO_WANT_HELPER_PROCESS=1")
// This time we're invoking the production handler.
out, err := handleGitSwitch(cmd)
if err != nil {
t.Fatal(err)
}
want := "Switched to branch 'feature-branch'\n"
if out != want {
t.Fatalf("got %q, want %q", out, want)
}
}
In this test, the subprocess behavior (git switch) is emulated by TestGitSwitchHelper.
The helper prints predictable output that mimics the real command, but the subprocess itself
is still a separate process spawned by the parent test.
What’s under test here is handleGitSwitch, which manages subprocess execution, reads its
output, and handles errors. The subprocess is fake in behavior but real in execution, which
means the I/O boundaries are still exercised.
This separation between subprocess creation and handling keeps tests focused and repeatable. You can emulate different subprocess outcomes, such as errors or unexpected output, while keeping the process interaction logic untouched.
Still, I’ve found that principles like SOLID, despite their OO origin, can be useful guides when thinking about design in Go.
Recently, while chatting with a few colleagues new to Go, I noticed that some of them had spontaneously rediscovered the Interface Segregation Principle (the “I” in SOLID) without even realizing it. The benefits were obvious, but without a shared vocabulary, it was harder to talk about and generalize the idea.
So I wanted to revisit ISP in the context of Go and show how small interfaces, implicit implementation, and consumer-defined contracts make interface segregation feel natural and lead to code that’s easier to test and maintain.
Clients should not be forced to depend on methods they do not use.
– Robert C. Martin (SOLID, interface segregation principle)
Or, put simply: your code shouldn’t accept anything it doesn’t use.
Consider this example:
type FileStorage struct{}
func (FileStorage) Save(data []byte) error {
fmt.Println("Saving data to disk...")
return nil
}
func (FileStorage) Load(id string) ([]byte, error) {
fmt.Println("Loading data from disk...")
return []byte("data"), nil
}
FileStorage has two methods: Save and Load. Now suppose you write a function that only
needs to save data:
func Backup(fs FileStorage, data []byte) error {
return fs.Save(data)
}
This works, but there are a few problems hiding here.
Backup takes a FileStorage directly, so it only works with that type. If you later want
to back up to memory, a network location, or an encrypted store, you’ll need to rewrite the
function. Because it depends on a concrete type, your tests have to use FileStorage too,
which might involve disk I/O or other side effects you don’t want in unit tests. And from
the function signature, it’s not obvious what part of FileStorage the function actually
uses.
Instead of depending on a specific type, we can depend on an abstraction. In Go, you can achieve that through an interface. So let’s define one:
type Storage interface {
Save(data []byte) error
Load(id string) ([]byte, error)
}
Now Backup can take a Storage instead:
func Backup(store Storage, data []byte) error {
return store.Save(data)
}
Backup now depends on behavior, not implementation. You can plug in anything that
satisfies Storage: something that writes to disk, to memory, or even to a remote service.
And FileStorage still works without any change.
You can also test it with a fake:
type FakeStorage struct{}
func (FakeStorage) Save(data []byte) error { return nil }
func (FakeStorage) Load(id string) ([]byte, error) { return nil, nil }
func TestBackup(t *testing.T) {
fake := FakeStorage{}
err := Backup(fake, []byte("test-data"))
if err != nil {
t.Fatal(err)
}
}
That’s a step forward. It fixes the coupling issue and makes the tests free of side effects.
However, there’s still one issue: Backup only calls Save, yet the Storage interface
includes both Save and Load. If Storage later gains more methods, every fake must grow
too, even if those methods aren’t used. That’s exactly what the ISP warns against.
The above interface is too broad. So let’s narrow it to match what the function actually needs:
type Saver interface {
Save(data []byte) error
}
Then update the function:
func Backup(s Saver, data []byte) error {
return s.Save(data)
}
Now the intent is clear. Backup only depends on Save. A test double can just implement
that one method:
type FakeSaver struct{}
func (FakeSaver) Save(data []byte) error { return nil }
func TestBackup(t *testing.T) {
fake := FakeSaver{}
err := Backup(fake, []byte("test-data"))
if err != nil {
t.Fatal(err)
}
}
The original FileStorage still works fine:
fs := FileStorage{}
_ = Backup(fs, []byte("backup-data"))
Go’s implicit interface satisfaction makes this less ceremonious. Any type with a Save
method automatically satisfies Saver.
This pattern reflects a broader Go convention: define small interfaces on the consumer side, close to the code that uses them. The consumer knows what subset of behavior it needs and can define a minimal contract for it. If you define the interface on the producer side instead, every consumer is forced to depend on that definition. A single change to the producer’s interface can ripple across your codebase unnecessarily.
From Go code review comments:
Go interfaces generally belong in the package that uses values of the interface type, not the package that implements those values. The implementing package should return concrete (usually pointer or struct) types: that way, new methods can be added to implementations without requiring extensive refactoring.
This isn’t a strict rule. The standard library defines producer-side interfaces like
io.Reader and io.Writer, which is fine because they’re stable and general-purpose. But
for application code, interfaces usually exist in only two places: production code and
tests. Keeping them near the consumer reduces coupling between multiple packages and keeps
the code easier to evolve.
You’ll see this same idea pop up all the time. Take the AWS SDK, for example. It’s tempting to define a big S3 client interface and use it everywhere:
type S3Client interface {
PutObject(
ctx context.Context,
input *s3.PutObjectInput,
opts ...func(*s3.Options)) (*s3.PutObjectOutput, error)
GetObject(
ctx context.Context,
input *s3.GetObjectInput,
opts ...func(*s3.Options)) (*s3.GetObjectOutput, error)
ListObjectsV2(
ctx context.Context,
input *s3.ListObjectsV2Input,
opts ...func(*s3.Options)) (*s3.ListObjectsV2Output, error)
// ...and many more
}
Depending on such a large interface couples your code to far more than it uses. Any change or addition to this interface can ripple through your code and tests for no good reason.
For example, if your code uploads files, it only needs the PutObject method:
func UploadReport(
ctx context.Context, client S3Client, data []byte,
) error {
_, err := client.PutObject(
ctx,
&s3.PutObjectInput{
Bucket: aws.String("reports"),
Key: aws.String("daily.csv"),
Body: bytes.NewReader(data),
},
)
return err
}
But accepting the full S3Client here ties UploadReport to an interface that’s too broad.
A fake must implement all the methods just to satisfy it.
It’s better to define a small, consumer-side interface that captures only the operations you need. This is exactly what the AWS SDK doc recommends for testing.
To support mocking, use Go interfaces instead of concrete service client, paginators, and waiter types, such as s3.Client. This allows your application to use patterns like dependency injection to test your application logic.
Similar to what we’ve seen before, you can define a single method interface:
type Uploader interface {
PutObject(
ctx context.Context,
input *s3.PutObjectInput,
opts ...func(*s3.Options)) (*s3.PutObjectOutput, error)
}
And then use it in the function:
func UploadReport(ctx context.Context, u Uploader, data []byte) error {
_, err := u.PutObject(
ctx,
&s3.PutObjectInput{
Bucket: aws.String("reports"),
Key: aws.String("daily.csv"),
Body: bytes.NewReader(data),
},
)
return err
}
The intent is obvious: this function uploads data and depends only on PutObject. The fake
for tests is now tiny:
type FakeUploader struct{}
func (FakeUploader) PutObject(
_ context.Context,
_ *s3.PutObjectInput,
_ ...func(*s3.Options)) (*s3.PutObjectOutput, error) {
return &s3.PutObjectOutput{}, nil
}
If we distill the workflow as a general rule of thumb, it’d look like this:
Insert a seam between two tightly coupled components by placing a consumer-side interface that exposes only the methods the caller invokes.
Fin!
Besides cancellation and deadlines, the context package can also
carry request-scoped values across API boundaries and processes.
There are only two public API constructs associated with context values:
func WithValue(parent Context, key, val any) Context
func (c Context) Value(key any) any
WithValue can take any comparable value as both the key and the value. The key defines how
the stored value is identified, and the value can be any data you want to pass through the
call chain.
Value, on the other hand, also returns any, which means the compiler cannot infer the
concrete type at compile time. To use the returned data safely, you must perform a type
assertion.
A naive workflow to store and retrieve values in a context looks like this:
ctx := context.Background()
// Store some value against a key
ctx = context.WithValue(ctx, "userID", 42)
// Retrieve the value
v := ctx.Value("userID")
// Value returns any, so you need a type assertion
id, ok := v.(int)
if !ok {
fmt.Println("unexpected type")
}
fmt.Println(id) // 42
WithValue returns a new context that wraps the parent. Value walks up the chain of
contexts and returns the first matching key it finds. Since the return type is any, a type
assertion is required to recover the original type. Without the ok check, a mismatch would
cause a panic.
The issue with this setup is that it risks collision. If another package sets a value against the same key, one overwrites the other:
package main
import (
"context"
"fmt"
)
func main() {
ctx := context.WithValue(context.Background(), "key", "from-main")
ctx = foo(ctx)
fmt.Println(ctx.Value("key")) // from-foo
}
func foo(ctx context.Context) context.Context {
// Accidentally reuse the same key in another package
return context.WithValue(ctx, "key", "from-foo")
}
The first value becomes inaccessible because WithValue returns a new derived context that
shadows parent values with the same key. The original value still exists in the parent
context but is unreachable through the reassigned variable.
To understand why this collision occurs, you need to know how Go compares interface values.
When you assign a value to an interface{} (or any), Go boxes that value into an internal
representation made up of two machine words: one points to the type information, and the
other points to the underlying data.
For example:
var a any = "key"
var b any = "key"
fmt.Println(a == b) // true
Each boxed interface here stores two things: a pointer to the type string and a pointer to
the data "key". Since both type and data pointers match, the comparison returns true.
WithValue stores both the key and the value as any. When you later call Value, Go
compares the boxed key you pass in with those stored in the context chain. If two different
packages use the same built-in key type and data, like both passing "key" as a string,
their boxed representations look identical. Go sees them as equal, and the most recent value
shadows the earlier one.
If you want to learn more about how interfaces are represented and compared, Russ Cox’s post on Go interface internals explains it in detail with pretty pictures.
The fix is to make sure the keys have unique types so their boxed representations differ. If you define a custom type, the type pointer changes even if the data looks the same. For example:
type userKey string
var a any = userKey("key")
var b any = "key"
fmt.Println(a == b) // false
Even though the underlying value is "key", the two interfaces now hold different type
information, so Go considers them unequal. That difference in type identity is what prevents
collisions.
The context documentation gives this advice:
The provided key must be comparable and should not be of type string or any other built-in type to avoid collisions between packages using context. Users of WithValue should define their own types for keys. To avoid allocating when assigning to an interface{}, context keys often have concrete type struct{}. Alternatively, exported context key variables' static type should be a pointer or interface.
In short:
- Keys must be comparable and should not be a built-in type (string, int, struct, pointer, etc.)
- Define your own unexported key types instead
- Prefer struct{} keys to avoid allocation when stored as any
- Give exported key variables a pointer or interface static type

Here’s how defining a unique key type prevents collisions:
type userIDKey string
// Store value
ctx := context.WithValue(context.Background(), userIDKey("id"), 42)
// Retrieve value
id := ctx.Value(userIDKey("id"))
fmt.Println(id) // 42
Even if another package uses the string "id", the key types differ, so they cannot
collide.
To avoid an allocation when WithValue boxes the key into an any, you can
define an empty struct key. Unlike strings or integers, which may allocate when boxed into an
interface, a zero-sized struct occupies no memory and needs no allocation:
type key struct{}
// Store value
ctx := context.WithValue(context.Background(), key{}, "value")
// Retrieve value
v := ctx.Value(key{})
fmt.Println(v) // value
Empty structs are ideal for local, unexported keys. They are unique by type and add no overhead.
Alternatively, exported keys can use pointers, which also avoid allocation and guarantee uniqueness. When a pointer is boxed into an interface, no data copy occurs because the interface just holds the pointer reference. Pointers are also ideal for keys that need to be shared across packages.
type userIDKey struct {
name string
}
// Struct pointer as key
var UserIDKey = &userIDKey{"user-id"}
// Store value. No allocation here since userIDKey is a pointer
// to a struct
ctx := context.WithValue(context.Background(), UserIDKey, 42)
// Retrieve value
id := ctx.Value(UserIDKey)
fmt.Println(id) // 42
Here, UserIDKey points to a unique struct instance, so equality checks work by pointer
identity. The name field exists only for debugging. This avoids allocation and ensures
exported keys remain unique even when shared between packages.
When exposing context values across APIs, you can approach it in two ways depending on how much control and safety you want to give your users.
You can export the key itself and let users interact with it freely:
type APIKey string
// Allow the other packages to directly use this key
var APIKeyContextKey = APIKey("api-key")
// Store value. An allocation will occur since the key is of type string
ctx := context.WithValue(context.Background(), APIKeyContextKey, "secret")
// Retrieve value
v := ctx.Value(APIKeyContextKey).(string) // caller must do this assertion
fmt.Println(v) // secret
When you export the key directly, the caller gains direct access, but they also must:
- use the correct key (and key type) when storing and retrieving values
- perform the type assertion themselves and handle a possible mismatch
The net/http package uses this approach for some of its exported context keys:
type contextKey struct {
name string
}
// Notice the exported keys
var (
ServerContextKey = &contextKey{"http-server"}
LocalAddrContextKey = &contextKey{"local-addr"}
)
Each variable points to a distinct struct, making them unique by pointer identity.
The serve_test.go file uses these keys like this:
ctx := context.WithValue(
context.Background(), http.ServerContextKey, srv,
)
// Type assertion to recover the concrete type
srv2, ok := ctx.Value(http.ServerContextKey).(*http.Server)
if ok {
fmt.Println(srv == srv2) // true
}
The server value is stored in the context and later retrieved using the same pointer key. The user must perform a type assertion and handle it safely.
The other approach is to hide the key and provide accessor functions to set and retrieve values. This removes the need for users to remember the right key type or perform type assertions manually.
// Define a private key type to avoid collisions
type contextKey struct {
name string
}
// Define the key
var userIDKey = &contextKey{"user-id"}
// Public accessor to store a value to ctx
func WithUserID(ctx context.Context, id int) context.Context {
// No allocation here since userIDKey is a pointer to a struct
return context.WithValue(ctx, userIDKey, id)
}
// Public accessor to fetch a value from ctx
func UserIDFromContext(ctx context.Context) (int, bool) {
v, ok := ctx.Value(userIDKey).(int)
return v, ok
}
// Store value
ctx := WithUserID(context.Background(), 42)
// Retrieve value
id, ok := UserIDFromContext(ctx)
if ok {
fmt.Println(id) // 42
} else {
fmt.Println("no user ID found in context")
}
This approach centralizes how values are stored and retrieved from the context. It ensures the correct key and type are always used, preventing collisions and runtime panics. It also keeps the calling code shorter since your API users won’t need to repeat type assertions everywhere.
WithX / XFromContext accessors appear throughout the Go standard library:
func WithClientTrace(
ctx context.Context, trace *ClientTrace,
) context.Context
func ContextClientTrace(ctx context.Context) *ClientTrace
func WithLabels(ctx context.Context, labels LabelSet) context.Context
func Labels(ctx context.Context) LabelSet
You can find similar examples outside of the stdlib. For instance, the OpenTelemetry Go SDK follows the same model:
func ContextWithSpan(parent context.Context, span Span) context.Context
func SpanFromContext(ctx context.Context) Span
This technique standardizes how values are passed across APIs, eliminates redundant type assertions, and prevents key misuse across packages.
I usually use a pointer to a struct as a key and expose accessor functions when building user-facing APIs. Otherwise, in services, I often define empty struct keys and expose them publicly to avoid the ceremony around accessor functions.
When it comes to organizing tests, Go’s testing library only gives you a few
options. I think that’s a great thing because there are fewer details to remember and fewer
things to onboard people to. However, during code reviews, I often see people contravene a
few common conventions around test organization, especially those who are new to the
language.
If we distill the most common questions that come up when organizing tests, they are:
- Where should test files live, and which package should they declare?
- When should tests share the package under test, and when should they use a separate _test package?
- Where do examples, benchmarks, and fuzz tests go?
- Where should integration and end-to-end tests live?
To answer these, let’s consider a simple test subject.
Let’s define a small app called myapp that contains a single package mypkg. It has a
Greet function that returns a greeting message as a string. We’ll use this throughout the
discussion and evolve the directory structure as needed.
myapp/
└── mypkg/
├── greet.go
└── greet_test.go
Here’s how greet.go looks:
// greet.go
package mypkg
func Greet(name string) string {
if name == "" {
return "Hello, stranger"
}
return "Hello, " + name
}
Most Go tests live next to the code they verify. These are called in-package tests, and they share the same package name as the code under test. This setup gives them access to unexported functions and variables, making them ideal for unit tests that target specific internal logic.
// greet_test.go
package mypkg // The test file lives under `mypkg`
import "testing"
func TestGreet(t *testing.T) {
got := Greet("Go") // The test can access mypkg deps without an import
want := "Hello, Go"
if got != want {
t.Fatalf("Greet() = %q, want %q", got, want)
}
}
The structure stays the same:
myapp/
└── mypkg/
├── greet.go # under package mypkg
└── greet_test.go # under package mypkg
These are your bread-and-butter unit tests. You can run them with go test ./..., and
they’ll have full access to unexported details in the package.
The Go documentation explains it as:
The test file can be in the same package as the one being tested. If the test file is in the same package, it may refer to unexported identifiers within the package.
This approach is called white-box testing. Your test code has full access to the package
internals, allowing you to test them directly when needed. For example, if there’s an
unexported function in greet.go, the test in greet_test.go can call it directly.
Following the test pyramid, most tests in your system should be written this way.
Sometimes you want to verify that your package behaves correctly from the outside. At this point, you’re not concerned with its internals and just want to confirm that the public API works as intended.
Go makes this possible by letting you write tests under a package name that ends with
_test. This creates a separate test package that lives alongside the package under test.
For example:
// greet_external_test.go
package mypkg_test // Note the package definition
import (
"testing"
"myapp/mypkg" // Explicitly import the SUT package
)
func TestGreetExternal(t *testing.T) {
got := mypkg.Greet("External")
want := "Hello, External"
if got != want {
t.Fatalf("unexpected output: got %q, want %q", got, want)
}
}
Your directory now includes both internal and external tests:
myapp/
└── mypkg/
├── greet.go # under package mypkg
├── greet_test.go # under package mypkg
└── greet_external_test.go # under package mypkg_test
In this setup, the mypkg directory can only contain the mypkg and mypkg_test packages.
The go tool recognizes the _test suffix and disallows any other package name in the same
directory.
A key detail is that the Go test harness doesn’t compile the tests of mypkg_test together
with those of mypkg. They are built as two separate packages: one containing the package
code and its in-package tests, and another containing the external tests, which links
against the compiled mypkg archive just like any other importing package and sees only its
exported identifiers. Both are then linked into a single test binary per directory. You can
find more about this process in the Go documentation on how tests are run.
This structure is particularly useful for validating public contracts and ensuring that refactors don’t break exported APIs.
As noted in the official testing package docs:
If the file is in a separate
_testpackage, the package being tested must be imported explicitly, and only its exported identifiers may be used. This is known as “black-box” testing.
It’s a neat way to test your package from the outside without moving your tests into a separate directory tree. You can find examples of this style in net/http, context, and errors.
Go’s testing tool treats examples, benchmarks, and fuzz tests as first-class test functions.
They use the same go test command as your regular unit tests and usually live in the same
package. This makes them part of the same discovery and execution process but with different
entry points.
Here’s how all three can coexist in the same package:
// greet_test.go
package mypkg // same package as the unit tests
import (
"fmt"
"testing"
)
// ... other unit tests
func ExampleGreet() {
fmt.Println(Greet("Alice"))
// Output: Hello, Alice
}
func BenchmarkGreet(b *testing.B) {
for b.Loop() {
Greet("Go")
}
}
func FuzzGreet(f *testing.F) {
f.Add("Bob")
f.Fuzz(func(t *testing.T, name string) {
Greet(name)
})
}
This setup doesn’t change your layout:
myapp/
└── mypkg/
├── greet.go # under package mypkg
└── greet_test.go # under package mypkg
If you prefer to separate these test types, you can move them into their own file while keeping them in the same package:
myapp/
└── mypkg/
├── greet.go # under package mypkg
├── greet_test.go # under package mypkg
└── greet_bench_fuzz_example_test.go # under package mypkg
In this layout, greet_bench_fuzz_example_test.go houses the benchmarks, fuzz tests, and
examples; note that the file name must end in _test.go for the go tool to treat it as a
test file. All files still declare the same package mypkg. These are regular unit tests
with specialized entry points. See how packages like encoding/json or html organize
their fuzz tests.
It’s not a strict rule to keep them in the same package. You can also put them in a _test
package. The sort package, for example, keeps its examples in sort_test.
As mentioned in the testing docs, benchmarks are discovered and executed with the -bench
flag, and fuzz tests with the -fuzz flag.
When your project grows into multiple packages, you’ll want to verify that everything works together, not just in isolation. That’s where integration and end-to-end tests come in. They typically live outside the package tree because they often span multiple packages or processes.
myapp/
├── mypkg/
│ ├── greet.go # under package mypkg
│ └── greet_test.go # under package mypkg
└── integration/
└── greet_integration_test.go # under package integration
Here’s what one might look like:
package integration
import (
"testing"
"myapp/mypkg" // Explicitly import the SUT pkg to use its deps
)
func TestGreetFlow(t *testing.T) {
got := mypkg.Greet("Integration")
want := "Hello, Integration"
if got != want {
t.Fatalf("unexpected output: got %q, want %q", got, want)
}
}
Integration tests import real packages and test their interactions. They can spin up servers, connect to databases, or coordinate subsystems. An integration test package is just like any other package: to use code from another package, it must import it explicitly.
You’ll see this pattern in kubernetes, which has a test directory with subpackages like
integration and e2e.
Having a top-level package for testing only makes sense if you’re testing multiple packages.
Otherwise, if you’re writing integration or functional tests for a single package, you can
still nest the tests under the SUT package. In this case, integration tests for mypkg can
be tucked away under mypkg/test.
The general rule of thumb is:
- Keep unit tests next to the code, in the same package.
- Put black-box tests in a _test package in the same directory.
- Keep examples, benchmarks, and fuzz tests alongside the unit tests, moving them to _test if needed.
- Put integration and end-to-end tests in their own top-level directory when they span multiple packages.

The following tree attempts to capture the full picture:
myapp/
├── mypkg/
│ ├── greet.go # mypkg - production code
│ ├── greet_test.go # mypkg - unit & white-box tests
│ ├── greet_external_test.go # mypkg_test - black-box tests
│   ├── greet_bench_fuzz_example_test.go # mypkg - examples, benchmarks, fuzz
└── integration/
└── greet_integration_test.go # integration or e2e tests
With t.Run, you can nest tests, assign names to cases, and let the runner execute work in
parallel by calling t.Parallel from subtests if needed.
For small suites, a flat set of t.Run calls is usually enough. That’s where I tend to
begin. As the suite grows, your setup and teardown requirements may demand subtest grouping.
There are multiple ways to handle that.
One option is to group subtests using nested t.Run. However, since t.Run supports
arbitrary nesting, it’s easy to create tests that are hard to read and reason about,
especially when each group has its own setup and teardown. When you add calls to
t.Parallel, it can also become unclear which groups of tests run sequentially and which
run in parallel.
This is all a bit hand wavy without examples. We’ll start with the simplest possible subtest grouping and work our way up. Coming up with examples that make the point while still fitting in a blog is tricky, so you’ll have to bear with my toy examples and use a bit of imagination.
Let’s say we’re writing tests for a calculator that, for the sake of argument, can only do addition and multiplication. Instead of going for table-driven tests, we’ll split the tests for addition and multiplication into two groups using subtests. The reason being, let’s say addition and multiplication need different kinds of setup and teardown for some reason.
I know I’m reaching, but bear with me. I’d rather make the point without dragging in mocks, databases, or testcontainers and getting lost in details. You’ll find a similar setup in real codebases wherever you talk to a database and the read and write paths have separate test lifecycles.
If we didn’t need different setup and teardown for the two groups, the simplest way to test a system would be through a set of table-driven tests:
func TestCalc(t *testing.T) {
// Common setup and teardown
tests := []struct {
name string
got int
want int
}{
{"1+1=2", 1 + 1, 2},
{"2+3=5", 2 + 3, 5},
{"2*2=4", 2 * 2, 4},
{"3*3=9", 3 * 3, 9},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
if tt.got != tt.want {
t.Fatalf("got %d, want %d", tt.got, tt.want)
}
})
}
}
Running the tests returns:
--- PASS: TestCalc (0.00s)
--- PASS: TestCalc/1+1=2 (0.00s)
--- PASS: TestCalc/2+3=5 (0.00s)
--- PASS: TestCalc/2*2=4 (0.00s)
--- PASS: TestCalc/3*3=9 (0.00s)
PASS
Unrolling the tests would give you this. The following is equivalent to the above test suite:
func TestCalc(t *testing.T) {
// Common setup and teardown
// Addition
t.Run("1+1=2", func(t *testing.T) {
if 1+1 != 2 {
t.Fatal("want 2")
}
})
t.Run("2+3=5", func(t *testing.T) {
if 2+3 != 5 {
t.Fatal("want 5")
}
})
// Multiplication
t.Run("2*2=4", func(t *testing.T) {
if 2*2 != 4 {
t.Fatal("want 4")
}
})
t.Run("3*3=9", func(t *testing.T) {
if 3*3 != 9 {
t.Fatal("want 9")
}
})
}
Observe that all the subtests live at the same level. The names of the tests are the indicator of which function of the calculator they’re testing. But this obviously doesn’t allow us to have separate lifecycles for the addition and multiplication groups. There’s no grouping as of now.
Grouping with t.Run when lifecycle diverges
To allow different setup and teardown for addition and multiplication, we can introduce
grouping by nesting the subtests via t.Run. Notice:
func TestCalc(t *testing.T) {
// Common setup and teardown
t.Run("addition", func(t *testing.T) {
// addition-specific setup
defer func() {
// addition-specific teardown
}()
t.Run("1+1=2", func(t *testing.T) {
if 1+1 != 2 {
t.Fatal("want 2")
}
})
t.Run("2+3=5", func(t *testing.T) {
if 2+3 != 5 {
t.Fatal("want 5")
}
})
})
t.Run("multiplication", func(t *testing.T) {
// multiplication-specific setup
defer func() {
// multiplication-specific teardown
}()
t.Run("2*2=4", func(t *testing.T) {
if 2*2 != 4 {
t.Fatal("want 4")
}
})
t.Run("3*3=9", func(t *testing.T) {
if 3*3 != 9 {
t.Fatal("want 9")
}
})
})
}
In this case, you can run the common setup and teardown in the top-level test function and the groups can have their own lifecycle operations alongside. Introducing the group also allows us to name them properly and they show up when we run the tests:
--- PASS: TestCalc (0.00s)
--- PASS: TestCalc/addition (0.00s)
--- PASS: TestCalc/addition/1+1=2 (0.00s)
--- PASS: TestCalc/addition/2+3=5 (0.00s)
--- PASS: TestCalc/multiplication (0.00s)
--- PASS: TestCalc/multiplication/2*2=4 (0.00s)
--- PASS: TestCalc/multiplication/3*3=9 (0.00s)
PASS
From the output it’s clear which subtests belong to which group. This setup also allows you
to run the groups in parallel by calling t.Parallel in each group.
func TestCalc(t *testing.T) {
// Common setup and teardown
t.Run("addition", func(t *testing.T) {
t.Parallel()
})
t.Run("multiplication", func(t *testing.T) {
t.Parallel()
})
}
Starting with flat subtests and nesting them one extra level with t.Run should suffice in
the majority of cases. Readability of your tests usually starts hurting when you need to
introduce any additional nesting.
I almost always frown when I encounter more than two degrees of nesting in a test suite. On
top of that, if your overly nested subtests start calling t.Parallel then it’s quite
difficult to reason about the test execution flow. Plus, maintaining the lifecycles of the
nested subgroups can get out of hand pretty quickly.
But even when you’re grouping subtests with two degrees of nesting, if the individual test logic starts getting longer, that might start hurting readability. Named functions for the subtests can help here in most cases.
We can rewrite the subtest grouping example of the previous section by extracting subtests into two group-specific functions like this:
func TestCalc(t *testing.T) {
// Common setup and teardown
t.Run("addition", addgroup)
t.Run("multiplication", multgroup)
}
func addgroup(t *testing.T) {
// addition-specific setup
defer func() {
// addition-specific teardown
}()
t.Run("1+1=2", func(t *testing.T) {
if 1+1 != 2 {
t.Fatal("want 2")
}
})
t.Run("2+3=5", func(t *testing.T) {
if 2+3 != 5 {
t.Fatal("want 5")
}
})
}
func multgroup(t *testing.T) {
// multiplication-specific setup
defer func() {
// multiplication-specific teardown
}()
t.Run("2*2=4", func(t *testing.T) {
if 2*2 != 4 {
t.Fatal("want 4")
}
})
t.Run("3*3=9", func(t *testing.T) {
if 3*3 != 9 {
t.Fatal("want 9")
}
})
}
All we did here is extract the groups into their own functions. Other than that this test is
identical to the previous two-degree subtest grouping. You can call t.Parallel from the
subgroup functions:
func TestCalc(t *testing.T) {
// Common setup and teardown
// ...
}
func addgroup(t *testing.T) {
// Run the group in parallel
t.Parallel()
}
func multgroup(t *testing.T) {
// Run the group in parallel
t.Parallel()
}
Or you can bring the t.Parallel at the top-level test function:
func TestCalc(t *testing.T) {
// Common setup and teardown
t.Run("addition", func(t *testing.T) {
t.Parallel()
addgroup(t) // addgroup doesn't have t.Parallel
})
t.Run("multiplication", func(t *testing.T) {
t.Parallel()
multgroup(t) // multgroup doesn't have t.Parallel
})
}
That’s all there is to it. But some people don’t like the manual wiring that we needed to do
in the top-level TestCalc function. Also, in a larger codebase, you’ll need some
discipline to make sure the pattern is followed by others extending the code.
So often people want the subtest groups to be automatically discovered without having to manually wire them in the main test function. While I’m not a big fan of automagical group discovery, I got curious about it nonetheless. The grpc-go project has a group discovery function that does this.
If we were writing tests inside the grpc-go repository, we could lean on its small helper
package, internal/grpctest, which reflects over a value you pass in, discovers methods
whose names start with Test, and runs each of those as a subtest. Crucially, the helper
also runs setup before and teardown after each discovered test method, which gives you a
clear spot for per-group lifecycle work. The public surface is tiny: RunSubTests(t, x)
plus a default hook carrier Tester that you embed to get Setup and Teardown.
Here is our same calculator suite in that style, as if we were adding tests inside grpc-go:
// NOTE: This import path only works inside the grpc-go repo family.
// External modules cannot import google.golang.org/grpc/internal/*.
package calc
import (
"testing"
"google.golang.org/grpc/internal/grpctest"
)
// CalcSuite: embed grpctest.Tester so we get Setup and Teardown hooks.
// The runner will discover TestAddition and TestMultiplication below.
type CalcSuite struct{ grpctest.Tester }
// TestAddition is discovered because the name starts with "Test".
func (CalcSuite) TestAddition(t *testing.T) {
// addition-specific setup and teardown for this group
defer func() {
// tear down addition fixtures
}()
t.Run("1+1=2", func(t *testing.T) {
if 1+1 != 2 {
t.Fatal("want 2")
}
})
t.Run("2+3=5", func(t *testing.T) {
if 2+3 != 5 {
t.Fatal("want 5")
}
})
}
// A second discovered group.
func (CalcSuite) TestMultiplication(t *testing.T) {
// multiplication-specific setup and teardown for this group
defer func() {
// tear down multiplication fixtures
}()
t.Run("2*2=4", func(t *testing.T) {
if 2*2 != 4 {
t.Fatal("want 4")
}
})
t.Run("3*3=9", func(t *testing.T) {
// call t.Parallel() here if overlapping is safe
if 3*3 != 9 {
t.Fatal("want 9")
}
})
}
// Top-level entry that "go test" sees.
// RunSubTests reflects over CalcSuite,
// then runs Setup, the test method, then Teardown.
func TestCalc(t *testing.T) {
grpctest.RunSubTests(t, CalcSuite{})
}
Outside grpc-go you can’t import google.golang.org/grpc/internal/grpctest because it lives
under an internal/ path. Go’s visibility rule only allows packages within that module tree
to use it. If you want the subtest discoverer, there’s nothing stopping you from blatantly
copying the code. It’s only a few dozen lines and devoid of any dependencies other than the
leak checker. You can drop the file in your tests, remove the leak checker code if you don’t
need that, adjust the import paths, and start using RunSubTests. To avoid repetition, I’ll
leave that as an exercise for the reader.
Another thing to point out is that grpctest.RunSubTests doesn’t change the standard
scheduler; you still opt into concurrency with t.Parallel() where it is safe.
If you like automatic subgroup discovery but want something you can use outside grpc-go, two common options are testify’s suite and Bloomberg’s go-testgroup. Both let you organize tests into named groups and keep per-group setup/teardown close to the cases.
Testify models a suite as a struct with Test* methods and gives you s.Run for subtests
and assertion helpers.
package calc
import (
"testing"
"github.com/stretchr/testify/suite"
)
type CalcSuite struct{ suite.Suite }
func (s *CalcSuite) TestAddition() {
s.Run("1+1=2", func() { s.Equal(2, 1+1) })
s.Run("2+3=5", func() { s.Equal(5, 2+3) })
}
func (s *CalcSuite) TestMultiplication() {
s.Run("2*2=4", func() { s.Equal(4, 2*2) })
s.Run("3*3=9", func() { s.Equal(9, 3*3) })
}
func TestCalc(t *testing.T) {
suite.Run(t, new(CalcSuite))
}
One limitation is that the suite runner doesn’t support using t.Parallel to run the
suite methods (TestAddition, TestMultiplication) in parallel. Bloomberg’s go-testgroup
allows you to do that.
Bloomberg’s library also groups by methods, but passes a *testgroup.T and provides two
runners so you can choose serial or parallel execution at the group level.
package calc
import (
"testing"
"github.com/bloomberg/go-testgroup"
)
type CalcGroup struct{}
func (g *CalcGroup) Addition(t *testgroup.T) {
t.Run("1+1=2", func(t *testgroup.T) { t.Equal(2, 1+1) })
t.Run("2+3=5", func(t *testgroup.T) { t.Equal(5, 2+3) })
}
func (g *CalcGroup) Multiplication(t *testgroup.T) {
t.Run("2*2=4", func(t *testgroup.T) { t.Equal(4, 2*2) })
t.Run("3*3=9", func(t *testgroup.T) { t.Equal(9, 3*3) })
}
func TestCalcSerial(t *testing.T) {
testgroup.RunSerially(t, &CalcGroup{})
}
// Or run in parallel.
// Don't call t.Parallel inside methods
func TestCalcParallel(t *testing.T) {
testgroup.RunInParallel(t, &CalcGroup{})
}
RunInParallel handles group-level parallelism for you and documents not to mix in your own
t.Parallel inside those methods.
While there are multiple ways to organize subtest groups, I try to keep them flat for as
long as possible. When grouping becomes necessary, I gradually add a single extra level of
nesting with t.Run.
In larger tests, extracting groups into their own named functions improves readability and maintainability quite a bit. I almost never use reflection-based wiring because that’s one extra bit of code to carry around.
I also tend to eschew pulling in third-party test suites unless I am already working in a codebase that uses them. Tools like testify or go-testgroup require you to define a struct and attach tests to it. I prefer to keep tests as standalone functions. In addition, testing frameworks often develop into mini-languages of their own, which makes onboarding harder. Notice how different the APIs of testify suite and go-testgroup are despite doing pretty much the same thing.
In my experience, even in large codebases, a bit of discipline is usually enough to get by with manual subtest grouping.
While structure often influences architecture and vice versa, this distinction is important. This post is strictly about application structure and not library structure. Library structure is often driven by different design pressures than its app counterpart. There are a ton of canonical examples of good library structure in the stdlib, but it’s app structure where things get a bit more muddy.
At work, I not only write Go in a distributed system environment but also review potential candidates’ assignments in the hiring pipeline. While there is no objectively right or wrong way to structure an app, I do see a common pitfall in candidates’ submissions that is usually frowned upon in a Go application.
App structure should be driven by what it does and not what it’s built with. Let the domain guide the structure, not technology or the current language-specific zeitgeist.
Ben Johnson’s Standard Package Layout is a good reference for this. He points out why approaches like monolithic packages, Rails style layouts, or grouping by module don’t fit well in Go. Then he lays out a map where the root package holds domain types, dependencies are grouped in separate packages, and the main package wires everything together.
While Ben’s post is focused on what you should be doing, I want to keep this discussion a bit more open-ended and just talk about one bad pattern that you probably should avoid. The rest of the app structure is subjective and should be driven by requirements. Use your judgement.
The mistake I often see is people making a bunch of generically named packages like
models, controllers, handlers and stuffing everything there. App structure like the
following is quite common:
mystore/
├── controllers/
│ ├── order_controller.go
│ └── user_controller.go
├── models/
│ ├── order.go
│ └── user.go
├── handlers/
│ ├── http_handler.go
│ └── webhook_handler.go
└── main.go
In Go there’s no file-level separation, only package-level separation. That means everything
under models, like order and user, lives in the same namespace. The same is true for
controllers and handlers.
Once you put multiple business domains under a generic umbrella, you tie them together. This might make sense in a language like Python where file names are prefixed in the fully qualified import path. In Python you’d import them as follows:
# Identifiers live in the order namespace
from mystore.models import order
# Identifiers live in the http_handler namespace
from mystore.handlers import http_handler
But in Go the import path becomes this:
// Identifiers from order.go and user.go
// all live in the same namespace
import "mystore/models"
// Identifiers from http_handler.go & webhook_handler.go
// all live in the same namespace
import "mystore/handlers"
There is no file level delineation in Go. If you put different domains under the same
models directory, there is no indication at import time what domain a model belongs to.
The only clue is the identifier name. This isn’t ideal when you want clear separation
between domains.
In Go, packages define your bounded context, not files within a package. Domains should be delineated by top level packages, not by file names.
For your top level business logic, you want package level separation between domains. Order
logic should live in order, user logic should live in user. These packages will be
imported in many places throughout the app, and keeping them separate keeps dependencies
clear.
It could look like this:
mystore/
├── order/ <-- business logic related to the order domain
│ ├── order.go
│ └── service.go
├── user/ <-- business logic related to the user domain
│ ├── user.go
│ └── service.go
└── cmd/ <-- wire everything here
└── mystore/
└── main.go
Each domain owns its own logic and optional adapters. If you need to find order related
code, you go to order. If you need user code, you go to user. Nothing is smooshed
together under a generic bucket.
The details around how you layer your app can differ based on requirements, but the important point is that your top level directories shouldn’t just be generic buckets containing all domains. That makes navigation harder. A better approach is letting the domain guide the structure and only layering in technology when it matters.
You can place your transport concerns alongside the top level packages. A top level http
package can hold handlers that import service functions from the domain packages. You can
put all handlers under http or split them into http/order and http/user. Both are
valid choices. If you put all handlers under http, that’s fine because they are usually
imported in one place where you wire routes. The same is true for database adapters. You can
put them all under postgres or split them into postgres/order and postgres/user. Both
patterns are acceptable. The key difference is that domains need package level separation,
while technology packages can be grouped because they are only wired at the edge.
mystore/
├── order/
│ ├── order.go
│ └── service.go
├── user/
│ ├── user.go
│ └── service.go
├── http/ <-- lumping all the handlers here is fine
│ ├── order_handler.go
│ └── user_handler.go
├── postgres/ <-- this is fine, but you can create sub pkgs too
│ ├── order_repo.go
│ └── user_repo.go
└── cmd/
└── server/
└── main.go
But depending on the complexity of your app, this is also absolutely fine:
mystore/
├── order/
│ ├── order.go
│ └── service.go
├── user/
│ ├── user.go
│ └── service.go
├── http/ <-- handlers are split by domain here
│ ├── order/
│ │ └── handler.go
│ └── user/
│ └── handler.go
├── postgres/ <-- repos are split by domain here
│ ├── order/
│ │ └── repo.go
│ └── user/
│ └── repo.go
└── cmd/
└── server/
└── main.go
The rule of thumb is that top level domains should never import anything from technology
folders like http or postgres. Instead, http and postgres should always import from
domain packages. You can add a linter to enforce this rule, but because http and postgres
already import the domains, Go’s ban on import cycles means the compiler stops a domain
package from importing them back.
+-----------+ +-----------+
| order | | user |
+-----------+ +-----------+
^ ^
| |
+------------------------------+
| http postgres |
+------------------------------+
^
|
+---------+
| cmd |
+---------+
Domains sit at the top. Technology packages depend on them, never the other way around. The
cmd package wires everything together. This keeps the graph simple and keeps domains
independent.
Astute readers might notice that I have left out any discussion around the internal
directory. This is intentional. Depending on your requirements, you might opt in for an
internal directory or not. This isn’t important for our discussion. The main point I
wanted to emphasize is that technology or architecture patterns shouldn’t guide your app
structure. It should be based on something more persistent and nothing is more persistent
than your application’s domain.
My colleague Matthias Doepmann recently fired a shot at AI-generated tests that don’t validate the behavior of the System Under Test (SUT) but instead create needless ceremony around internal implementations. At best, these tests give a shallow illusion of confidence in the system’s correctness while breaking at the smallest change. At worst, they remain green even when the SUT’s behavior changes.
In practice, they add maintenance overhead and drag down code reviews. The frustration in that post wasn’t about violating some abstract testing philosophy. It came from having to wade through countless implementation-checking tests churned out by LLMs across components of a real, large-scale distributed system.
I think the problem persists for three reasons:
The general theme when writing unit tests should be checking the behavior of the system, not the scaffolding of its implementation. It doesn’t matter which method called which, how many times, or with what arguments.
What matters is: if you give the SUT some input, does it return the expected output? In a stateful system, does the input cause the system to mutate some persistence layer in the expected way? That persistence layer doesn’t always need to be a real database; it could be an in-memory buffer.
In scenarios where your code invokes external systems, it is more useful to test your system with canned responses from upstream calls rather than testing which method is being called.
The salient point is: test outcomes, not implementation details. As the book Software Engineering at Google puts it: test state, not interactions:
With state testing, you observe the system itself to see what it looks like after invoking with it. With interaction testing, you instead check that the system took an expected sequence of actions on its collaborators in response to invoking it. Many tests will perform a combination of state and interaction validation.
And the guidance that follows:
By far the most important way to ensure this is to write tests that invoke the system being tested in the same way its users would; that is, make calls against its public API rather than its implementation details. If tests work the same way as the system’s users, by definition, change that breaks a test might also break a user.
I think the first step in the right direction is to accept that LLMs can’t substitute for thought. The first few critical tests in your system shouldn’t be written by LLMs, and you must vet the tests churned out by the genie that wants to leap. Next up: you can often get away without a mocking library, and skipping it more often than not improves the quality and maintainability of your tests.
Mocking libraries come with their own idiosyncratic syntax and workflows. On most occasions, handwritten fakes are better than mocks. I’ll use Go to make my point here because that’s what I write the most these days, but the lesson applies to other languages too.
Consider a simple UserService that depends on a DB interface. Its job is to delegate
user creation to the database and return any error to the caller:
// usersvc/usersvc.go
package usersvc
import "errors"
var ErrDuplicate = errors.New("duplicate user")
type DB interface {
InsertUser(name string) error
ListUsers() []string
}
type UserService struct {
db DB
}
func NewUserService(db DB) *UserService {
return &UserService{db: db}
}
// Baseline behavior: delegate to DB and surface errors to callers.
func (s *UserService) CreateUser(name string) error {
return s.db.InsertUser(name)
}
A mocking tool such as mockery can generate a mock implementation of the DB interface.
The generated code records calls and arguments so that tests can later assert whether the
expected interactions happened:
// usersvc/mocks/mock_db.go
// generated by:
// mockery --name=DB --dir=usersvc --output=usersvc/mocks \
// --outpkg=mocks --with-expecter
// simplified to remove unnecessary details
package mocks
import "github.com/stretchr/testify/mock"
type MockDB struct{ mock.Mock }
func (m *MockDB) InsertUser(name string) error {
args := m.Called(name)
return args.Error(0)
}
func (m *MockDB) ListUsers() []string {
args := m.Called()
return args.Get(0).([]string)
}
Using this mock, a test can be written to check that CreateUser interacts with the
dependency in the expected way:
// usersvc/usersvc_mock_test.go
package usersvc_test
import (
"testing"
"github.com/stretchr/testify/require"
"example.com/app/usersvc"
"example.com/app/usersvc/mocks"
)
func TestUserService_CreateUser(t *testing.T) {
db := mocks.NewMockDB(t)
svc := usersvc.NewUserService(db)
// Exact interaction expected.
db.EXPECT().InsertUser("alice").Return(nil).Once()
// Exercise public API.
err := svc.CreateUser("alice")
require.NoError(t, err)
// Verify the interaction occurred.
db.AssertExpectations(t)
}
This works mechanically, but it breaks down in practice:
It checks the collaborator call, not the result
A useful test would assert that “alice” was actually added or that a duplicate error was
returned. This one only verifies that InsertUser("alice") was invoked once.
It breaks on harmless refactors
If the database method is renamed while keeping the same semantics, callers see no difference but the test fails:
// usersvc/usersvc.go (harmless refactor, behavior unchanged)
package usersvc
type DB interface {
UpsertUser(name string) error // was InsertUser
ListUsers() []string
}
func (s *UserService) CreateUser(name string) error {
return s.db.UpsertUser(name) // same public behavior
}
The mock-based test either no longer compiles or needs rewiring, even though the public behavior didn’t change.
And worse, it survives real bugs
If an error is accidentally swallowed, callers get the wrong signal but the test still passes:
// usersvc/usersvc.go (buggy refactor: behavior changed)
package usersvc
func (s *UserService) CreateUser(name string) error {
_ = s.db.InsertUser(name) // ignore error by mistake
return nil // callers think it succeeded
}
A real DB or an in-memory fake would raise a constraint error that should propagate. The mock test goes green anyway because it only checked the call path.
The common thread is that mocks lock tests to implementation details. They don’t protect the behavior that real users rely on.
A better approach is to keep the same interface but back it with a handwritten fake. The fake encodes the domain rules you care about, and tests can focus on outcomes instead of verifying which collaborator methods were called.
Here, we’re writing the fake implementation of the DB interface by hand instead of
generating it with a mocking tool like mockery.
// usersvc/usersvc_fake_test.go
package usersvc_test
import "example.com/app/usersvc"
type FakeDB struct {
seen map[string]struct{}
order []string
}
func NewFakeDB() *FakeDB {
return &FakeDB{seen: make(map[string]struct{})}
}
func (f *FakeDB) InsertUser(name string) error {
if _, ok := f.seen[name]; ok {
return usersvc.ErrDuplicate
}
f.seen[name] = struct{}{}
f.order = append(f.order, name)
return nil
}
func (f *FakeDB) ListUsers() []string {
out := make([]string, len(f.order))
copy(out, f.order)
return out
}
Tests with the fake read like a statement of expected behavior:
// usersvc/usersvc_fake_test.go
package usersvc_test
import (
"testing"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
"example.com/app/usersvc"
)
func TestUserService_CreateUser(t *testing.T) {
db := NewFakeDB()
svc := usersvc.NewUserService(db)
require.NoError(t, svc.CreateUser("alice"))
assert.Equal(t, []string{"alice"}, db.ListUsers()) // outcome observed
}
func TestUserService_CreateUser_DuplicateSurfaces(t *testing.T) {
db := NewFakeDB()
svc := usersvc.NewUserService(db)
require.NoError(t, svc.CreateUser("alice"))
err := svc.CreateUser("alice")
require.ErrorIs(t, err, usersvc.ErrDuplicate) // behavior enforced
assert.Equal(t, []string{"alice"}, db.ListUsers()) // state unchanged
}
This avoids the fragility of mocks. The tests survive harmless refactors, fail when behavior changes, and stay readable without a mocking DSL.
The cost is maintaining the fake as the interface evolves. In practice, that’s still easier than constantly updating brittle mock expectations and occasionally dealing with the mock library’s lengthy migration workflow.
Sometimes the right move is to test against a real database running in a container. That is still state testing, just at a higher fidelity. The tradeoff is speed: you get stronger confidence in behavior, but the tests run slower.
Most of the time, handwritten in-memory fakes are what you need, and most tests should stick to those. When you do need the same behavior you would see in production, tools like testcontainers let you spin up databases, queues, or caches inside containers. Your tests can then call the SUT normally, with its configuration pointing at the containerized service, just as production code would connect to a production resource.
This is not a rally against using LLMs for tests. But the seed tests, the first handful that set the standard, need to come from you. They define what correctness means in your system and give the ensuing tests a model to follow. If you hand that job to an LLM, you give up the chance to shape how the rest of the suite grows.
This isn’t to disparage mocking libraries either. But I have seen people armed with overzealous LLMs and mocks wreak havoc on a test suite and then unironically ask reviewers to review the mess. Instead of validating behavior, the suite fills up with fragile interaction checks that break on refactors and stay green through real bugs.
More often than not, you can skip mocking libraries and rely on handwritten fakes that check the behavior of the SUT instead of its interactions. The next person who needs to read and extend your tests might thank you for that.
The pattern usually looks like this: spawn one goroutine per task, have each send its result to its own unbuffered channel, then receive the results one by one.
The trap is the early return. With an unbuffered channel, a send blocks until a receiver is ready. If you return before reading from the remaining channels, the goroutines writing to them block forever. That’s a goroutine leak.
Here’s how the bug appears in a tiny example: one worker intentionally fails, causing the
main goroutine to bail early. That early return skips the receive from ch2, leaving the
sender on ch2 stuck.
type result struct{ err error }
func Example() error {
ch1 := make(chan result) // unbuffered
ch2 := make(chan result) // unbuffered
// Simulate a failing worker by sending an error into ch1.
// This is intentional to trigger the early return below.
go func() { ch1 <- result{err: fmt.Errorf("oops")} }()
// Simulate a successful worker that will try to send into ch2.
go func() { ch2 <- result{err: nil} }()
// Receive the first result.
res1 := <-ch1
if res1.err != nil {
// We return right away because of the error.
// Because we never read from ch2, the goroutine sending to ch2
// is now blocked forever on its send. That goroutine leaks.
return res1.err
}
// This receive is skipped on the error path above.
res2 := <-ch2
if res2.err != nil {
return res2.err
}
return nil
}
One simple fix is to make sure you always read from both channels before you decide what to do. This guarantees that every send has a matching receive and no goroutine gets stuck:
func ExampleDrain() error {
ch1 := make(chan result)
ch2 := make(chan result)
go func() { ch1 <- result{err: fmt.Errorf("oops")} }() // same failure
go func() { ch2 <- result{err: nil} }() // same success
// Always receive both. Both sends now complete.
res1 := <-ch1
res2 := <-ch2
if res1.err != nil {
return res1.err
}
if res2.err != nil {
return res2.err
}
return nil
}
This is safe but it means you always wait for both workers even when the first one already failed and the second result is irrelevant. If you want to return early without leaking, another option is to use buffered channels so the producers don’t block on send. A buffer of size one is enough for this pattern.
func ExampleBuffered() error {
ch1 := make(chan result, 1) // buffered so sends do not block
ch2 := make(chan result, 1)
go func() { ch1 <- result{err: fmt.Errorf("oops")} }() // failure
go func() { ch2 <- result{err: nil} }() // success
// Receive the first result and decide.
res1 := <-ch1
if res1.err != nil {
// Safe to return early. The send to ch2 already completed
// into its buffer even though we have not read it yet.
return res1.err
}
// Still read ch2 to consume its buffered value
res2 := <-ch2
if res2.err != nil {
return res2.err
}
return nil
}
Buffered channels remove the blocked send, but they also make it easier to forget that a second result exists at all. If that second value carries data you must process, you should still receive it. If it is truly fire and forget, buffering is fine.
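To see the mechanism in isolation: a send into a channel with spare buffer capacity completes even though nobody has received yet, which is exactly why the early return above is safe. A minimal sketch:

```go
package main

import "fmt"

func main() {
	done := make(chan struct{})
	ch := make(chan int, 1) // capacity 1

	go func() {
		ch <- 42 // completes immediately into the buffer
		close(done)
	}()

	<-done // the sender finished even though nothing has read ch yet
	fmt.Println(<-ch) // 42
}
```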
Often the cleanest approach is to drop the channel plumbing when you only need to run tasks and aggregate errors. The errgroup package lets each goroutine return an error while the group does the waiting. There is nothing to forget to receive, so there is nothing to leak.
import (
"fmt"
"golang.org/x/sync/errgroup"
)
func ExampleErrgroup() error {
var g errgroup.Group
// Task 1 fails and returns an error.
g.Go(func() error {
return fmt.Errorf("oops")
})
// Task 2 succeeds.
g.Go(func() error {
return nil
})
// Wait waits for both tasks and returns the first error, if any.
return g.Wait()
}
Sometimes you also want peers to stop once one task fails. errgroup.WithContext gives you
a context that gets canceled as soon as any task returns an error. You pass that context
into your workers and have them check ctx.Done() so they can exit quickly.
import (
"context"
"fmt"
"time"
"golang.org/x/sync/errgroup"
)
func ExampleErrgroupWithContext() error {
// When any task returns an error, ctx is canceled.
g, ctx := errgroup.WithContext(context.Background())
// Task 1 fails quickly to simulate an early error.
g.Go(func() error {
return fmt.Errorf("oops")
})
// Task 2 is long running but cooperates with cancellation.
g.Go(func() error {
for {
select {
case <-ctx.Done():
// Exits because Task 1 failed and canceled the context.
return ctx.Err()
default:
time.Sleep(10 * time.Millisecond)
}
}
})
return g.Wait()
}
At this point it is natural to ask if tools can catch the original bug for you. go vet
cannot. Vet is static analysis that runs at build time. Whether a send blocks depends on
runtime control flow and timing. Vet cannot prove that the function returns before a
particular receive in a general way, so it doesn’t flag this pattern.
go test -race cannot either. The race detector detects unsynchronized concurrent memory
access. A goroutine stuck on a channel send isn’t a data race. You may see a test hang until
timeout, but the tool won’t point to a leaking goroutine.
You can turn this into a failing test with goleak from Uber. goleak fails if goroutines
are still alive when a test ends. It snapshots all goroutines via the runtime, filters out
the standard background ones, and reports the rest. Wire it into a test that triggers the
early return and you will see the blocked sender’s stack in the output.
Here is a test that leaks and fails:
package example_test
import (
"fmt"
"testing"
"go.uber.org/goleak"
)
type result struct{ err error }
func buggyEarlyReturn() error {
ch1 := make(chan result)
ch2 := make(chan result)
// Force the early-return path by sending an error on ch1.
go func() { ch1 <- result{err: fmt.Errorf("oops")} }()
// This send will block forever on the failing path
// because nobody receives ch2.
go func() { ch2 <- result{err: nil} }()
r1 := <-ch1
if r1.err != nil {
return r1.err // leak: ch2 sender is stuck
}
<-ch2 // drain ch2 on the success path
return nil
}
func TestBuggyLeaks(t *testing.T) {
// fails if any goroutines are stuck at test end
defer goleak.VerifyNone(t)
_ = buggyEarlyReturn()
}
This test fails and prints the goroutine stack stuck in the send to ch2.
=== RUN TestBuggyLeaks
main_test.go:34: found unexpected goroutines:
[Goroutine 24 in state chan send,
with thing.buggyEarlyReturn.func2 on top of the stack:
thing.buggyEarlyReturn.func2()
.../main_test.go:20 +0x28
created by thing.buggyEarlyReturn in goroutine 22
.../main_test.go:20 +0xc0
]
--- FAIL: TestBuggyLeaks (0.44s)
FAIL
exit status 1
If you switch the implementation to a fixed version, the test passes. For example, the draining fix:
func fixedDrain() error {
ch1 := make(chan result)
ch2 := make(chan result)
go func() { ch1 <- result{err: fmt.Errorf("oops")} }()
go func() { ch2 <- result{err: nil} }()
r1 := <-ch1
r2 := <-ch2
if r1.err != nil {
return r1.err
}
if r2.err != nil {
return r2.err
}
return nil
}
func TestFixedNoLeaks(t *testing.T) {
defer goleak.VerifyNone(t)
_ = fixedDrain()
}
If you prefer suite wide enforcement, add goleak to your TestMain. This way your entire
test run fails if any test leaks goroutines.
package main
import (
"testing"
"go.uber.org/goleak"
)
func TestMain(m *testing.M) {
// VerifyTestMain wraps the whole test run
// and fails if any goroutines are left behind.
goleak.VerifyTestMain(m)
}
If you start goroutines that send on channels, think carefully about early returns. An unbuffered send waits for a receive, and if you return before that receive happens, you’ve leaked a goroutine.
You can avoid this by receiving from every channel before returning, by using buffered channels so sends complete without a receiver, or by using errgroup, with or without context, so tasks return errors and cooperate on cancellation.
Add goleak to your tests so leaks surface early during development.
By lifecycle I mean the usual setup and teardown hooks or fixtures that are common in other languages. I think this is a good thing because you don’t need to pick up many different framework-specific workflows for something so fundamental.
Go gives you enough hooks to handle this with less ceremony. But it can still be tricky to figure out the right conventions for setup and teardown that don’t look odd to other Gophers, especially if you haven’t written Go for a while. This text explores some common ways to do lifecycle management in your Go tests.
Before we cover multiple testing scenarios, it’s useful to understand how Go’s test harness actually runs your tests.
When you type go test, Go doesn’t interpret test files directly. It collects all the
_test.go files in a package, compiles them together with the rest of the package, and
produces a temporary binary. That binary contains both your code and your tests, along with
a small harness that drives them. The harness then runs the binary and reports results.
From the “go test” command doc:
“go test” automates testing the packages named by the import paths. […] recompiles each package along with any files with names matching the file pattern “*_test.go”.
Inside each package, the harness looks for test functions. A function qualifies if it has the form:
func TestXxx(t *testing.T)
where Xxx starts with an uppercase letter. There are no annotations or decorators, just
naming convention. Functions that don’t match this signature are ignored.
By default, the harness runs tests sequentially. If you want concurrency, you can opt in at
the test level. Calling t.Parallel() inside a test signals that this test may run
alongside others in the same package that also call t.Parallel(). Tests that don’t opt in
remain strictly ordered.
Every package with tests produces its own binary, and those binaries are run independently. There is no global suite that links packages together, so setup and teardown only exist inside one package’s process. If you have ten packages containing tests, you get ten binaries, each with its own lifecycle.
For example:
project/
├── go.mod
├── db/
│ ├── db.go
│ └── db_test.go
└── api/
├── api.go
└── api_test.go
Running go test ./... produces two binaries: one for db and one for api. Each binary
bundles the package code and its tests, and each binary runs on its own. The harness
aggregates the results and prints a combined report, but execution itself is confined to the
package.
It is important to note that there is no file-level scope. All _test.go files in a package
are merged into a single binary, so there is no way to run setup once per file. Similarly,
there is no cross-package scope. Go does not let you set up once for all tests in a module
or tear down after the last package finishes. If you need orchestration across packages, it
has to happen outside of go test, for example in a shell script or a CI pipeline step.
With this background, we can now look at the lifecycle hooks Go does provide. They apply at three levels: per test function, per group of subtests, and per package.
Typically you need to perform setup and teardown before and after each test function, each group of subtests, or the entire package’s test run.
The smallest scope is the test function itself. You create resources at the start of the
test and clean them up when it ends. This pattern is common when you want each test to run
against a fresh state with no leakage from other tests. The idiomatic way in Go is to wrap
the setup in a helper and register the cleanup with t.Cleanup.
type TestDB struct{}
// newTestDB sets up a fresh database for a single test
func newTestDB(t *testing.T) *TestDB {
t.Helper()
db := &TestDB{}
// cleanup tied to the function scope
t.Cleanup(func() {
db.Close()
})
return db
}
func (db *TestDB) Close() {}
func (db *TestDB) Insert(k, v string) error { return nil }
func (db *TestDB) Query(k string) (string, error) { return "value", nil }
func TestInsert(t *testing.T) {
db := newTestDB(t) // new DB created for this test only
if err := db.Insert("foo", "bar"); err != nil {
t.Fatalf("insert failed: %v", err)
}
}
In this example, TestInsert gets its own new database. The cleanup registered with
t.Cleanup makes sure the database is closed when the test finishes. The resource is never
shared with other tests, which gives you strong isolation. The downside is that if your
setup is expensive, it will run before and after every test function, which can slow things
down.
The next scope is a group of subtests. Instead of repeating setup for every test, you create the resource once in the parent test and share it with the children. Teardown runs when the parent finishes. This works well when you want to test a flow of operations against the same shared state.
func TestUserFlow(t *testing.T) {
// new DB created once for this group
// t.Cleanup() gets called after all the subtests finish and
// the parent returns
db := newTestDB(t)
t.Run("insert user", func(t *testing.T) {
if err := db.Insert("user:1", "alice"); err != nil {
t.Fatal(err)
}
})
t.Run("query user", func(t *testing.T) {
val, err := db.Query("user:1")
if err != nil {
t.Fatal(err)
}
if val != "alice" {
t.Fatalf("expected alice, got %s", val)
}
})
}
Here both subtests share the same database, and the cleanup runs once when TestUserFlow
ends. This is useful when your tests need to act on shared state, like inserting a record
and then querying it. The trade-off is that the tests are no longer fully independent, and
if one subtest leaves the database in a bad state, others may fail in unexpected ways.
The broadest scope is the package. If you define TestMain, the test harness calls it
instead of running the tests directly. You can perform setup, run all the tests, and then
perform teardown. This allows you to reuse an expensive resource across all tests in the
package.
var globalDB *TestDB
func TestMain(m *testing.M) {
globalDB = &TestDB{} // setup once for the entire package
code := m.Run()
globalDB.Close() // teardown after all tests
os.Exit(code)
}
func TestGlobalInsert(t *testing.T) {
if err := globalDB.Insert("k", "v"); err != nil {
t.Fatal(err)
}
}
Here the database is created once and reused by all tests in the package. The teardown runs when everything is finished. This can make your tests run much faster if setup is expensive, but you pay for it in global (package wide) state. If one test mutates the shared resource in an unexpected way, other tests may start failing, and debugging those failures can be difficult.
Also, remember that your setup and teardown are still package bound: each package can have its own TestMain. Reasoning about their order can get out of hand quickly, so make sure your tests never depend on the order of TestMain execution. Treat these like init functions and use them sparingly.
These three scopes are not mutually exclusive. You can combine them when you need different
levels of control. A typical pattern is to have TestMain start a package-wide service,
create a shared schema or fixture in a parent test for a group of related subtests, and then
still use per-test setup inside individual subtests for fine-grained isolation. Each call to
newTestDB creates a fresh database, so using it at different levels produces different
resources with different lifetimes.
func TestOrders(t *testing.T) {
schema := newTestDB(t) // group-level DB shared across subtests
t.Run("create order", func(t *testing.T) {
db := newTestDB(t) // per-test DB, fresh for this subtest only
db.Insert("order:1", "widget")
})
t.Run("query order", func(t *testing.T) {
// uses the group-level DB, so the state persists across subtests
schema.Insert("order:1", "widget")
val, _ := schema.Query("order:1")
if val != "widget" {
t.Fatalf("expected widget, got %s", val)
}
})
}
In this example, TestMain could be running a package-wide database server. The parent test
TestOrders sets up a schema that is shared across its subtests. Inside, one subtest spins
up its own per-test database to work in isolation, while another uses the shared schema to
test how state persists across operations.
The combination of package, group, and function scopes gives you flexibility: reuse expensive resources when you need to, and isolate state when correctness depends on it. However, combining scopes can be hard to reason about when you have many different subtests under a single parent that are also interacting with some global state. I tend to avoid this whenever possible.
Most of your setup and teardown should happen at the function level. That gives you the strongest isolation and keeps each test self-contained.
The next most useful pattern is at the subtest group level, where you create a resource once in a parent test and let its children share it. Cleanup runs when the parent finishes, which makes sense when you really do want that shared state.
Package-level setup through TestMain should be rare. It is tempting when setup is
expensive, but global state is the fastest way to end up with brittle tests. Mixing
different scopes is possible, but usually creates more confusion than clarity, so reach for
it only when you have no better option.
Usually in Go, people make a package called external or http and stash the logic of
communicating with external services there. Then the business logic depends on the
external package to invoke the RPC call. This is already better than directly making RPC
calls inside your service functions, as that would make these two separate concerns
(business logic and external-service wrangling) tightly coupled. Testing these concerns in
isolation, therefore, would be a lot harder.
While this is a fairly common practice, I was looking for a canonical name for this pattern to talk about it in a less hand-wavy way. Turns out Martin Fowler wrote a blog post on it a few moons ago, and he calls it the Gateway pattern. He explores the philosophy in more detail and gives some examples in JS. However, I thought that Gophers could benefit from a few examples to showcase how it translates to Go. Plus, I wanted to reify the following axiom:
High-level modules should not depend on low-level modules. Both should depend on abstractions. Abstractions should not depend on details. Details should depend on abstractions.
– Dependency inversion principle (D in SOLID), Uncle Bob
In this scenario, our business logic in the order package is the high-level module and
external is the low-level module, as the latter concerns itself with transport details.
Inside external, we could communicate with the external dependencies via either HTTP or
gRPC. But that’s an implementation detail and shouldn’t make any difference to the
high-level order package.
order will communicate with external via a common interface. This is how we satisfy the
“both should depend on abstractions” part of the ethos.
Our app layout looks like this:
yourapp/
├── cmd/ # wire up the deps
│ └── main.go
├── order/ # business logic in the service functions
│ ├── service.go
│ └── service_test.go
├── external/ # code to communicate with external deps
│ └── stripe/
│ ├── gateway.go
│ ├── mock_gateway.go
│ └── gateway_test.go
└── go.mod / go.sum
Let’s walk through the flow from the bottom up. Think of it as walking back from the edge to the core, in Alistair Cockburn’s Hexagonal Architecture lingo, where the edge represents the transport logic and the core the business concerns.
The Stripe implementation lives in external/stripe/gateway.go. For simplicity’s sake,
we’re pretending to call the Stripe API over HTTP, but this could be a gRPC call to another
service.
// external/stripe/gateway.go
package stripe
import "fmt"
type StripeGateway struct {
APIKey string
}
func NewStripeGateway(apiKey string) *StripeGateway {
return &StripeGateway{APIKey: apiKey}
}
// Handle all the details of making HTTP calls to the Stripe service here.
func (s *StripeGateway) Charge(
amount int64, currency string, source string) (string, error) {
fmt.Printf(
"[Stripe] Charging %d %s to card %s\n",
amount, currency, source,
)
return "txn_live_123", nil
}
// Make another HTTP call to the Stripe service to perform a refund.
func (s *StripeGateway) Refund(transactionID string) error {
fmt.Printf("[Stripe] Refunding transaction %s\n", transactionID)
return nil
}
Notice that the stripe package handles the details of communicating with the Stripe
endpoint, but it doesn’t export any interface for the higher-level module to use. This is
intentional.
In Go, the general advice is that the consumer should define the interface they want, not the provider.
Go interfaces generally belong in the package that uses values of the interface type, not the package that implements those values.
– Go code review comments
That gives the consumer full control over what it wants to depend on, and nothing more. You don’t accidentally couple your code to a bloated interface just because the implementation provided one. You define exactly the shape you need and mock that in your tests.
Clients should not be forced to depend on methods they do not use.
– Interface segregation principle (I in SOLID), Uncle Bob
So, in the order package, we define a tiny private interface that reflects the use case.
// order/service.go
package order
// The order service only requires the Charge method of a payment gateway.
// So we define a tiny interface here on the consumer side rather
// than on the producer side
type paymentGateway interface {
Charge(amount int64, currency string, source string) (string, error)
}
type Service struct {
gateway paymentGateway
}
// Pass the Stripe implementation of paymentGateway at runtime here.
func NewService(gateway paymentGateway) *Service {
return &Service{gateway: gateway}
}
// In production, this calls .Charge on the Stripe implementation.
// During tests, it calls .Charge on a mock gateway.
func (s *Service) Checkout(amount int64, source string) error {
_, err := s.gateway.Charge(amount, "USD", source)
return err
}
The order service doesn’t know or care which implementation of the gateway it’s using to
perform some action. It just knows it can call Charge on the provided gateway type. It
doesn’t need to care about the Refund method on the Stripe gateway implementation. Also,
the paymentGateway interface is bound to the order package, so we’re not polluting the
API surface with a bunch of tiny interfaces.
Now, when testing the service logic, you just need to write a tiny mock implementation of
paymentGateway and pass it to order.Service. You don’t need to reach into the
external/stripe package or wire up anything complicated. You can place the fake right next
to your service test. Since interface implementations in Go are implicitly satisfied,
everything just works without much fuss.
// order/service_test.go
package order_test
import (
"testing"
"yourapp/order"
)
type mockGateway struct {
calledAmount int64
calledSource string
}
func (m *mockGateway) Charge(
amount int64, currency, source string) (string, error) {
m.calledAmount = amount
m.calledSource = source
return "txn_mock", nil
}
func TestCheckoutCallsCharge(t *testing.T) {
mock := &mockGateway{}
svc := order.NewService(mock)
err := svc.Checkout(1000, "test_source_abc")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if mock.calledAmount != 1000 {
t.Errorf("expected amount 1000, got %d", mock.calledAmount)
}
if mock.calledSource != "test_source_abc" {
t.Errorf("want source test_source_abc, got %s",
mock.calledSource)
}
}
The test is focused only on what matters: Does the service call Charge with the correct
arguments? We’re not testing Stripe here. That’s its own concern.
You can still write tests for the Stripe client if you want. You’d do that in
external/stripe/gateway_test.go.
// external/stripe/gateway_test.go
package stripe_test
import (
"testing"
"yourapp/external/stripe"
)
func TestStripeGateway_Charge(t *testing.T) {
gw := stripe.NewStripeGateway("dummy-key")
txn, err := gw.Charge(1000, "USD", "tok_abc")
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if txn == "" {
t.Fatal("expected transaction ID, got empty string")
}
}
Finally, everything is wired together in cmd/main.go.
// cmd/main.go
package main
import (
"yourapp/external/stripe"
"yourapp/order"
)
func main() {
stripeGw := stripe.NewStripeGateway("live-api-key")
// Passing the real Stripe gateway to the order service.
orderSvc := order.NewService(stripeGw)
_ = orderSvc.Checkout(5000, "tok_live_card_xyz")
}
It’s also common to call gateways “client.” Some people prefer that name. However, I think client is way overloaded, which makes it hard to discuss the pattern clearly. There’s the HTTP client, the gRPC client, and then your own client that wraps these. It gets confusing fast. I prefer “gateway,” as Martin Fowler used in his original text.
In Go context, the core idea is that a service function uses a locally defined gateway interface to communicate with external gateway providers. This way, the service and the external providers are unaware of each other’s existence and can be tested independently.
So where does X come from?
I like to rely on Go’s standard tooling so that integration and snapshot tests can live right beside ordinary unit tests. Because I usually run these heavier tests in testcontainers, I don’t always want them running while I’m iterating on a feature or chasing a bug. So I need a way to enable them selectively.
To fetch the X and conditionally run some tests, you’ll typically see three approaches:
- Build tags – guard optional test files with //go:build tags so they’re only compiled when the tag is passed.
- Environment variables – have the test check a variable (e.g. RUN_INTEGRATION=1) and skip itself if it’s absent.
- Custom go test flags (my preferred approach) – define your own flags so you can run, for example, go test -run Integration -integration.
Build tags are special comments you place at the top of a .go file to tell Go to include that file only when certain tags are set during the build. This is how they typically look:
//go:build snapshot
package main
import "testing"
func TestSnapshot(t *testing.T) {
t.Log("running snapshot")
}
This file will only be compiled and included when you run:
go test -tags=snapshot
If you don’t pass the tag, the file is skipped entirely during the build. Go won’t even see the test.
The upside is that it gives you a clean separation. You can group slow tests or environment-dependent tests into their own files. But the downsides add up quickly.
First, there’s no way to discover which tags are used without grepping through the codebase. Go itself won’t tell you. go help test doesn’t mention them, and there’s no built-in list or summary. You have to rely solely on documentation.
Second, build tags are applied per file, not per package. That means if even one test in a file is guarded by a tag, the entire file is excluded unless the tag is passed. This makes it difficult to mix optional and always-on tests in the same file.
And third, once you have more than a couple of tags, managing them becomes guesswork. You end up running things like:
go test -tags=slow,mock,external
But you no longer remember what each one does or what combinations are safe. There’s no validation. It gets messy fast.
Environment variables let you control test behavior at runtime. You don’t need to recompile anything, and you can pass them inline when running tests.
Here’s a typical example:
import "os"
func TestSnapshot(t *testing.T) {
if os.Getenv("SNAPSHOT") != "1" {
t.Skip("set SNAPSHOT=1 to run this test")
}
t.Log("running snapshot")
}
You run it like:
SNAPSHOT=1 go test -v
This is more dynamic than build tags. You don’t have to split tests into separate files, and you don’t have to rebuild with special flags. More importantly, the test itself can detect when the environment variable is missing and tell you what to do. It can skip itself and print a message like “set SNAPSHOT=1 to run this test.” That feedback loop is helpful.
But the discovery problem remains. There’s no built-in way to ask, “what environment variables does this test suite support?” You still have to read the code to find out.
It can get worse if the check is buried deep in a helper. Maybe some setup logic does:
if os.Getenv("SNAPSHOT") == "1" {
useRealService()
}
Now the test runs, but the behavior changes silently based on the environment. Nothing in the test output tells you that the envvar was involved. You may not even realize that you’re running in a different mode.
And just like with build tags, there’s no central registry. No docs or summary. You can only hope someone left a good comment or wrote it down somewhere.
The cleanest and most discoverable way to control optional test behavior in Go is by
defining your own test flags. They’re typed, explicit, and work well with Go’s built-in
tooling. Instead of toggling tests with magic file-level build tags or invisible environment
variables, you can wire up test configuration using the flag package, just like any other
Go binary.
There are two common approaches for defining test flags:
- Declare them at the package level and parse them in TestMain.
- Declare them in a file-level init().
Both approaches register the flag in the global flag set, so every test in the package can
see the value once parsing has happened. The trade-off is indirection versus locality:
TestMain centralizes all flags in one place, while file-level init() keeps each flag
next to the code that cares about it.
Here’s how it looks with TestMain:
package snapshot_test
import (
"flag"
"os"
"testing"
)
var snapshot = flag.Bool("snapshot", false, "run snapshot tests")
func TestMain(m *testing.M) {
flag.Parse()
os.Exit(m.Run())
}
func TestSnapshot(t *testing.T) {
if !*snapshot {
t.Skip("pass -snapshot to run this test")
}
t.Log("running snapshot")
}
And here’s the equivalent using init() to keep everything in the same file:
package snapshot_test
import (
"flag"
"testing"
)
var snapshot bool
func init() {
flag.BoolVar(&snapshot, "snapshot", false, "run snapshot tests")
}
func TestSnapshot(t *testing.T) {
if !snapshot {
t.Skip("pass -snapshot to run this test")
}
t.Log("running snapshot")
}
Once you’ve defined a flag, you run the snapshot tests like this:
go test -v -snapshot
You can also list all the flags using:
go test -v -args -h
This prints all registered flags, including your own:
-snapshot
run snapshot tests
-test.v
verbose: print all tests as they are run.
-test.run
run only those tests and examples matching the regular expression.
# ...
A detail about names: built-in flags show up in the help output with a test. prefix
(-test.v, -test.run, -test.timeout), yet you pass them without that prefix (-v,
-run, -timeout) while running tests. The Go tool strips test. for you. Custom flags
don’t get this treatment. Whatever string you register is the exact string you must pass. If
you register snapshot you run:
go test -snapshot
If you register test.snapshot you must run:
go test -test.snapshot
There is no automatic collapsing just because the name starts with test..
The flag -args lets you pass additional arguments to the test binary. When the binary sees
-h after -args, it prints every flag and exits. No tests run, though the binary is
built. That one command exposes the full configuration surface of your tests.
If you namespace your flags like this:
flag.BoolVar(&snapshot, "custom.snapshot", false, "run snapshot tests")
Then you can grep for them:
go test -v -args -h | grep custom
Define the global flags in TestMain when several files need the same switches or when you
have package-wide setup (containers, databases, global mocks). Define flags in init() when
a switch is relevant to one test file and you want the declaration right next to the logic
it controls. I usually prefer file-level flags that don’t depend on any global magic.
Either way, the flag lives in code, is easy to grep, appears in -h, and tells everyone
exactly what it controls. The only downside I can think of with this approach is that,
similar to the environment variable technique, you’ll have to check for the flag in every
test and make a decision. But in practice, I prefer the flexibility over the all-or-nothing
approach with build tags.
I think flags are the best way to configure your apps and tools. Even when environment
variables are involved, I often map them to flags for documentation purposes. The goal is to
give users a single -h command they can run to see all available options for tuning
behavior. Tests are no exception. I was quite happy to find out that Peter Bourgon conveyed
the same sentiment in this seminal 2018 blog post.
Dependency Injection is a 25-dollar term for a 5-cent concept.
– James Shore
DI basically means passing values into a constructor instead of creating them inside it. That’s really it. Observe:
type server struct {
db DB
}
// NewServer constructs a server instance
func NewServer() *server {
db := DB{} // The dependency is created here
return &server{db: db}
}
Here, NewServer creates its own DB. Instead, to inject the dependency, build DB
elsewhere and pass it in as a constructor parameter:
func NewServer(db DB) *server {
return &server{db: db}
}
Now the constructor no longer decides how a database is built; it simply receives one.
In Go, DI is often done using interfaces. You collate the behavior you care about in an
interface, and then provide different concrete implementations for different contexts. In
production, you pass a real implementation of DB. In unit tests, you pass a fake
implementation that behaves the same way from the caller’s perspective but avoids real
database calls.
Here’s how that looks:
// behaviour we care about
type DB interface {
Get(id string) (string, error)
Save(id, value string) error
}
type server struct{ db DB }
// NewServer accepts a DB implementation and passes it to server
func NewServer(db DB) *server { return &server{db: db} }
A real implementation of DB might look like this:
type RealDB struct{ url string }
func NewDB(url string) *RealDB { return &RealDB{url: url} }
func (r *RealDB) Get(id string) (string, error) {
// pretend we hit Postgres
return "real value", nil
}
func (r *RealDB) Save(id, value string) error { return nil }
And a fake implementation for unit tests might be:
type FakeDB struct{ data map[string]string }
func NewFake() *FakeDB { return &FakeDB{data: map[string]string{}} }
func (f *FakeDB) Get(id string) (string, error) {
return f.data[id], nil
}
func (f *FakeDB) Save(id, value string) error {
f.data[id] = value
return nil
}
Use the fake in unit tests like so:
func TestServerGet(t *testing.T) {
fake := NewFake()
_ = fake.Save("42", "fake")
srv := NewServer(fake)
val, _ := srv.db.Get("42")
if val != "fake" {
t.Fatalf("want fake, got %s", val)
}
}
The compiler guarantees both RealDB and FakeDB satisfy DB, and during tests, we can
swap out the implementations without much ceremony.
Once NewServer grows half a dozen dependencies, wiring them by hand can feel noisy. That’s
when a DI framework starts looking tempting.
With Uber’s dig, you register each constructor as a provider. Provide takes a
function, uses reflection to inspect its parameters and return type, and adds it as a node
in an internal dependency graph. Nothing is executed yet. Things only run when you call
.Invoke() on the container.
But that reflection-driven magic is also where the pain starts. As your graph grows, it gets harder to tell which constructor feeds which. Some constructors take one parameter, some take three. There’s no single place you can glance at to understand the wiring. It’s all figured out inside the container at runtime.
Let the container figure it out!
– every DI framework ever
func BuildContainer() *dig.Container {
c := dig.New()
// Each Provide call teaches dig about one node in the graph.
c.Provide(NewConfig) // produces *Config
c.Provide(NewDB) // wants *Config, produces *DB
c.Provide(NewRepo) // wants *DB, produces *Repo
c.Provide(NewFlagClient) // produces *FlagClient
c.Provide(NewService) // wants *Repo, *FlagClient, produces *Service
c.Provide(NewServer) // wants *Service, produces *server
return c
}
func main() {
// Invoke starts the graph; dig sorts and calls constructors
if err := BuildContainer().Invoke(
func(s *server) { s.Run() }); err != nil {
panic(err)
}
}
Now try commenting out NewFlagClient. The code still compiles. There’s no error until
runtime, when dig fails to construct NewService due to a missing dependency. And the error
message you get?
dig invoke failed: could not build arguments for function
main.main.func1 (prog.go:87)
: failed to build *main.Server
: could not build arguments for function main.NewServer (prog.go:65)
: failed to build *main.Service: missing dependencies for function
main.NewService (prog.go:55)
: missing type: *main.FlagClient
That’s five stack frames deep, far from where the problem started. Now you’re digging through dig’s internals to reconstruct the graph in your head.
Google’s wire takes a different approach: it shifts the graph-building to code
generation. You collect your constructors in a wire.NewSet, call wire.Build, and the
generator writes a wire_gen.go that wires everything up explicitly.
var serverSet = wire.NewSet(
NewConfig,
NewDB,
NewRepo,
NewFlagClient, // comment out to see Wire complain
NewService,
NewServer,
)
func InitializeServer() (*server, error) {
wire.Build(serverSet)
return nil, nil // replaced by generated code
}
Comment out NewFlagClient and Wire fails earlier - during generation:
wire: ../../service/wire.go:13:2: cannot find dependency for *flags.Client
It’s better than dig’s runtime panic, but still comes with its own headaches:
- You must remember to re-run go generate ./... whenever constructor signatures change.
- You have to learn Wire’s vocabulary: wire.NewSet, wire.Build, build tags, and sentinel rules. And if you ever switch to something different like dig, you’ll need to learn a completely different set of concepts: Provide, Invoke, scopes, named values, etc.
While DI frameworks tend to use vocabularies like provider or container to give you a sense of familiarity, they still reinvent the API surface every time. Switching between them means relearning a new mental model.
So the promise of “just register your providers and forget about wiring” ends up trading clear, compile-time control for either reflection or hidden generator logic - and yet another abstraction layer you have to debug.
In Go, you can just wire your own dependencies manually. Like this:
func main() {
cfg := NewConfig()
db := NewDB(cfg.DSN)
repo := NewRepo(db)
flags := NewFlagClient(cfg.FlagURL)
svc := NewService(repo, flags, cfg.APIKey)
srv := NewServer(svc, cfg.ListenAddr)
srv.Run()
}
Longer? Yes. But:
- The call order is the dependency graph.
- Errors are handled right where they happen.
- If a constructor changes, the compiler points straight at every broken call:
./main.go:33:39: not enough arguments in call to NewService
have (*Repo, *FlagClient)
want (*Repo, *FlagClient, string)
No reflection, no generated code, no global state. Go type-checks the dependency graph early and loudly, exactly how it should be. And also, it doesn’t confuse your LSP, so your IDE keeps on being useful.
If main() really grows unwieldy, split your code:
func buildInfra(cfg *Config) (*DB, *FlagClient, error) {
// ...
}
func buildService(cfg *Config) (*Service, error) {
db, flags, err := buildInfra(cfg)
if err != nil { return nil, err }
return NewService(NewRepo(db), flags, cfg.APIKey), nil
}
func main() {
cfg := NewConfig()
svc, err := buildService(cfg)
if err != nil { log.Fatal(err) }
NewServer(svc, cfg.ListenAddr).Run()
}
Each helper is a regular function that anyone can skim without reading a framework manual. Also, you usually build all of your dependencies in one place, and it’s really not a big deal if your builder function takes 20 parameters and constructs everything. Just put each parameter on its own line and use gofumpt to format the code so it stays readable.
Other languages lean on containers because constructors often can’t be overloaded and compile times hurt. Go already gives you fast builds, implicitly satisfied interfaces, and plain functions for wiring.
A DI framework often fixes problems Go already solved and trades away readability to do it.
The most magical thing about Go is how little magic it allows.
– Some Gopher on Reddit
It’s tempting to make a blanket statement saying that you should never pick up a DI framework, but context matters here.
I was watching Uber’s GopherCon talk on Go at scale and how their DI framework Fx (which uses dig underneath) lets them achieve consistency at scale. If you’re Uber and have all the observability tools in place to get around the downsides, the trade-off can be worth it.
Also, if you’re working in a codebase that’s already leveraging a framework and it works well, then it doesn’t make sense to refactor it without any incentives.
Or, you’re writing one of those languages where using a DI framework is the norm, and you’ll be called a weirdo if you try to reinvent the wheel there.
However, in my experience, even in organizations that maintain a substantial number of Go repos, DI frameworks add more confusion than they’re worth. If your experience is otherwise, I’d love to be proven wrong.
The post got a fair bit of discussion going around the web. You might find it interesting.
Take this example: passing a sync.WaitGroup by value will break things in subtle ways:
func f(wg sync.WaitGroup) {
// ... do something with the waitgroup
}
func main() {
var wg sync.WaitGroup
f(wg) // oops! wg is getting copied here!
}
sync.WaitGroup lets you wait for multiple goroutines to finish some work. Under the hood,
it’s a struct with methods like Add, Done, and Wait to sync concurrently running
goroutines.
That snippet compiles fine but leads to buggy behavior because we’re copying the lock
instead of referencing it in the f function.
Luckily, go vet catches it. If you run vet on that code, you’ll get a warning like this:
f passes lock by value: sync.WaitGroup contains sync.noCopy
call of f copies lock value: sync.WaitGroup contains sync.noCopy
This means we’re passing wg by value when we should be passing a reference. Here’s the
fix:
func f(wg *sync.WaitGroup) { // pass by reference
// ... do something with the waitgroup
}
func main() {
var wg sync.WaitGroup
f(&wg) // pass a pointer to wg
}
Since this kind of incorrect copy doesn’t throw a compile-time error, if you skip go vet,
you might never catch it. Another reason to always vet your code.
I was curious how the Go toolchain enforces this. The clue is in the vet warning:
call of f copies lock value: sync.WaitGroup contains sync.noCopy
So the sync.noCopy struct inside sync.WaitGroup is doing something to alert go vet
when you pass it by value.
Looking at the implementation of sync.WaitGroup, you’ll see:
type WaitGroup struct {
noCopy noCopy
state atomic.Uint64
sema uint32
}
Then I traced the definition of noCopy in sync/cond.go:
// noCopy may be added to structs which must not be copied
// after the first use.
// Note that it must not be embedded, due to the Lock and Unlock methods.
type noCopy struct{}
// Lock is a no-op used by -copylocks checker from `go vet`.
func (*noCopy) Lock() {}
func (*noCopy) Unlock() {}
Just having those no-op Lock and Unlock methods on noCopy is enough. This implements
the Locker interface. Then if you put that struct inside another one, go vet will flag
cases where you try to copy the outer struct.
Also, note the comment: don’t embed noCopy. Include it explicitly. Embedding would
expose Lock and Unlock on the outer struct, which you probably don’t want.
The Go toolchain enforces this with the copylocks checker, which is part of go vet. You can run it on its own with go vet -copylocks ./.... It looks for value copies of any
struct that nests a struct with Lock and Unlock methods. It doesn’t matter what those
methods do, just having them is enough.
When vet runs, it walks the AST and applies the checker on assignments, function calls,
return values, struct literals, range loops, channel sends, basically anywhere values can
get copied. If it sees you copying a struct with noCopy, it yells.
Interestingly, if you define noCopy as anything other than a struct and implement the
Locker interface, vet ignores that. I tested this on Go 1.24:
type noCopy int // this is valid but vet doesn't get triggered
func (*noCopy) Lock() {}
func (*noCopy) Unlock() {}
This doesn’t trigger vet. It only works when noCopy is a struct. The reason is that vet
takes a shortcut in the copylock checker when deciding whether to trigger the warning.
Currently, it explicitly looks for a struct that satisfies the Locker interface and
ignores any other type even if it implements the interface.
You’ll see this in other parts of the sync package too. sync.Mutex uses the same trick:
type Mutex struct {
_ noCopy
mu isync.Mutex
}
Same with sync.Once:
type Once struct {
done uint32
m Mutex
noCopy noCopy
}
Here’s a complete example of abusing -copylocks to prevent copying our own struct:
type Svc struct{ _ noCopy }
type noCopy struct{}
func (*noCopy) Lock() {}
func (*noCopy) Unlock() {}
func main() {
var svc Svc
s := svc // go vet flags this assignment copy
fmt.Println(s) // and this call, which copies svc again
}
Running go vet on this gives:
assignment copies lock value to s: play.Svc contains play.noCopy
call of fmt.Println copies lock value: play.Svc contains play.noCopy
Someone on Reddit asked me what actually triggers the copylock checker in go vet - is it
the struct’s literal name noCopy or the fact that it implements the Locker interface?
The name noCopy isn’t special. You can call it whatever you want. As long as it implements
the Locker interface, go vet will complain if the surrounding struct gets copied. See
this Go Playground snippet.
Go 1.24 introduced a new tool directive that makes it easier to manage your project’s tooling.
I used to rely on Make targets to install and run tools like stringer, mockgen, and
linters like gofumpt, goimports, staticcheck, and errcheck. Problem is, these
installations were global, and they’d often clash between projects.
Another big issue was frequent version mismatch. I ran into cases where people were formatting the same codebase differently because they had different versions of the tools installed. Then CI would yell at everyone because it was always installing the latest version of the tools before running them. Chaos!
tools.go convention
To avoid this mess, the Go community came up with a convention where you’d pin your tool
versions in a tools.go file. I’ve written about omitting dev dependencies before. But
the gist is, you’d have a tools.go file in your root directory that imports the tooling
and assigns them to _:
//go:build tools
// tools.go
package tools
import (
_ "github.com/golangci/golangci-lint/cmd/golangci-lint"
_ "mvdan.cc/gofumpt"
)
Since these dependencies aren’t used directly in the codebase, the //go:build tools
directive ensures they’re excluded from the main build.
Then running go mod tidy keeps things clean and includes these dev dependencies in the
go.mod and go.sum files.
This works, but it always felt a bit clunky. You end up polluting your main go.mod with
tooling-only dependencies. And sometimes, transitive dependencies of those tools clash with
your app’s dependencies.
The new tool directive in Go 1.24 solves some of the tools.go pain points.
tool directive
With Go 1.24, you can now add tooling with the -tool flag when using go get:
go get -tool github.com/golangci/golangci-lint/cmd/golangci-lint@latest
This adds the dependency to your go.mod like this:
module github.com/rednafi/foo
go 1.24.2
tool github.com/golangci/golangci-lint/cmd/golangci-lint
// ... other transitive dependencies
Notice the tool directive clearly separates these from regular module dependencies.
Then you can run the tool with:
go tool golangci-lint run ./...
One thing to keep in mind: the first time you run a tool this way, it might take a second - Go needs to compile it before running if it isn’t already compiled. After that, it’s cached, so subsequent runs are fast.
go generate?
This also plays nicely with go generate. I’ve started replacing direct tool calls with
go tool, so contributors don’t need to install tools globally. Just run go generate and
you’re done:
//go:generate go tool stringer -type=MyEnum
No further setup needed, no path issues, and it’s always using the version you pinned.
That said, one thing still bugs me: go get -tool adds these dev tools to the main go.mod
file. That means your application and dev dependencies are still mixed together. Same
problem the tools.go hack had.
There’s no built-in way to avoid this yet. So your options are:
1. Accept the mixed dev dependencies in your main go.mod file.
2. Use a separate tools module to isolate your tooling. A bit clunky, but doable.
I went with the second option.
My layout looks like this:
.
├── go.mod
├── go.sum
└── tools
└── go.mod
Then I install tools like this:
cd tools
go get -tool github.com/golangci/golangci-lint/cmd/golangci-lint@latest
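After that, tools/go.mod might look something like this (module path and contents are illustrative):

```
module github.com/rednafi/foo/tools

go 1.24.2

tool github.com/golangci/golangci-lint/cmd/golangci-lint

// a require block with the tool's transitive dependencies follows,
// isolated from the application's go.mod
```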
And run them from the root directory as follows:
go tool -modfile tools/go.mod golangci-lint run ./...
The go tool command supports a -modfile flag that you can use to specify where to pull
the tool version from. I really wish go get supported -modfile too - that way you
wouldn’t need to manage the dependencies in such a wonky manner. This was close to being
perfect. Well, maybe in a future release.
Another limitation is that it only works with tools written in Go. So if you’re using stuff
like eslint, prettier, or jq, you’re on your own. But for most of my projects, the dev
tooling is written in Go anyway, so this setup has been working okay.
Functions that print are easier to test when they accept an io.Writer and
write to it instead. However, it’s common to encounter functions like this:
func frobnicate() {
fmt.Println("do something")
}
This would be easier to test if frobnicate would ask for a writer to write to. For
instance:
func frobnicate(w io.Writer) {
fmt.Fprintln(w, "do something")
}
You could pass os.Stdout to frobnicate explicitly to write to the console:
func main() {
frobnicate(os.Stdout)
}
This behaves exactly the same way as the first version of frobnicate.
During test, instead of os.Stdout, you’d just pass a bytes.Buffer and assert its content
as follows:
func TestFrobnicate(t *testing.T) {
// Create a buffer to capture the output
var buf bytes.Buffer
// Call the function with the buffer
frobnicate(&buf)
// Check if the output is as expected
expected := "do something\n"
if buf.String() != expected {
t.Errorf("Expected %q, got %q", expected, buf.String())
}
}
This is all good. But many functions or methods that emit logs just do that directly to
stdout. So we want to test the first version of frobnicate without making any changes to
it.
I found this neat pattern to test functions that write to stdout without accepting a writer.
The idea is to write a helper function named captureStdout that looks like this:
// captureStdout replaces os.Stdout with a buffer and returns it.
func captureStdout(f func()) string {
old := os.Stdout
r, w, _ := os.Pipe()
os.Stdout = w
f() // run the function that writes to stdout
_ = w.Close()
var buf bytes.Buffer
_, _ = io.Copy(&buf, r)
os.Stdout = old
return buf.String()
}
Here’s what’s happening under the hood:
We use os.Pipe() to create a pipe: a connected pair of file descriptors - a reader (r)
and a writer (w). Think of it like a temporary tunnel. Whatever we write to w, we can
read back from r. Since both are just files as far as Go is concerned, we can temporarily
replace os.Stdout with the writer end of the pipe:
os.Stdout = w
This means anything printed to stdout during the function run actually goes into our pipe. After the function runs, we close the writer to signal that we’re done writing, then read from the reader into a buffer and restore the original stdout.
Now we can test frobnicate without touching its implementation:
func TestFrobnicate(t *testing.T) {
output := captureStdout(func() {
frobnicate()
})
expected := "do something\n"
if output != expected {
t.Errorf("Expected %q, got %q", expected, output)
}
}
No need to refactor frobnicate. This works great for quick tests when you don’t control
the code or just want to assert some printed output.
The above version of captureStdout works fine for simple cases. But in practice, functions
might also write to stderr, especially if they’re using Go’s log package or if a panic
happens. For example, this would not be captured by the simple captureStdout helper:
log.Println("something went wrong")
Even though it looks like a normal print statement, log writes to stderr by default. So
if you want to catch that output too, or generally capture everything that’s printed to the
console during a function call, we need to upgrade our helper a bit. I found this example
from immudb’s captureOutput helper.
Here’s a more complete version:
// captureOut captures both stdout and stderr.
func captureOut(f func()) string {
// Create a pipe to capture stdout
custReader, custWriter, err := os.Pipe()
if err != nil {
panic(err)
}
// Save the original stdout and stderr to restore later
origStdout := os.Stdout
origStderr := os.Stderr
// Restore stdout and stderr when done
defer func() {
os.Stdout = origStdout
os.Stderr = origStderr
}()
// Set the stdout and stderr to the pipe
os.Stdout, os.Stderr = custWriter, custWriter
log.SetOutput(custWriter)
// Create a channel to read the output from the pipe
out := make(chan string)
// Goroutine reads from pipe and sends output to channel
var wg sync.WaitGroup
wg.Add(1)
go func() {
var buf bytes.Buffer
wg.Done()
io.Copy(&buf, custReader)
out <- buf.String()
}()
wg.Wait()
// Call the function that writes to stdout
f()
// Close the writer to signal that we're done
_ = custWriter.Close()
// Wait for the goroutine to finish reading from the pipe
return <-out
}
This version does a few more things:
Captures everything: It redirects both os.Stdout and os.Stderr to ensure all
standard output streams are captured. It also explicitly redirects the standard log
package’s output, since log holds its own writer reference and isn’t affected by
reassigning the os.Stderr variable.
Prevents deadlocks: Output is read concurrently in a separate goroutine. This is
crucial because if f generates more output than the internal pipe buffer can hold,
writing would block without a concurrent reader, causing a deadlock.
Ensures reader readiness: A sync.WaitGroup guarantees the reading goroutine is active
before f starts executing. This prevents a potential race condition where initial output
could be lost if f writes before the reader is ready.
Guarantees cleanup: Using defer, the original os.Stdout and os.Stderr are always
restored, even if f panics. This prevents the function from permanently altering the
program’s standard output streams.
You’d use captureOut the same way as the naive captureStdout. This version is safer and
more complete, and works well when you’re testing CLI commands, log-heavy code, or anything
that might write to the terminal in unexpected ways.
It’s not a replacement for writing functions that accept io.Writer, but when you’re
dealing with existing code or want to quickly assert on terminal output, it gets the job
done.
You can’t run the teardown inside the helper itself because the test still needs the setup.
For example, in the following case, the helper runs its teardown immediately:
func TestFoo(t *testing.T) {
helper(t)
// Test logic here: resources may already be cleaned up!
}
func helper(t *testing.T) {
t.Helper()
// Setup code here.
// Teardown: this deferred cleanup runs when helper returns,
// not at the end of the test.
defer func() {
// Clean up something.
}()
}
When helper is called, it defers its teardown - which executes at the end of the helper
function, not the test. But the test logic still depends on whatever the helper set up. So
this approach doesn’t work.
The next working option is to move the teardown logic into the test itself:
func TestFoo(t *testing.T) {
helper(t)
// Run the teardown of helper.
defer func() {
// Clean up something.
}()
// Test logic here.
}
func helper(t *testing.T) {
t.Helper()
// Setup code here.
// No teardown here; we move it to the caller.
}
This works fine if you have only one helper. But with multiple helpers, it quickly becomes messy - you now have to manage multiple teardown calls manually, like this:
func TestFoo(t *testing.T) {
helper1(t)
helper2(t)
defer func() {
// Clean up helper2.
}()
defer func() {
// Clean up helper1.
}()
// Test logic here.
}
You also need to be careful with the order: defer statements are executed in LIFO
(last-in, first-out) order. So if teardown order matters, this can be a problem. Ideally,
your tests shouldn’t depend on teardown order - but sometimes they do.
So rather than manually handling cleanup inside the test, have helpers return a teardown
function that the test can defer itself. Here’s how:
func TestFoo(t *testing.T) {
teardown1 := helper1(t)
defer teardown1()
teardown2 := helper2(t)
defer teardown2()
// Test logic here.
}
func helper1(t *testing.T) func() {
t.Helper()
// Setup code here.
// Maybe create a temp dir, start a mock server, etc.
return func() {
// Teardown code here.
}
}
func helper2(t *testing.T) func() {
t.Helper()
// Setup code here.
return func() {
// Teardown code here.
}
}
Each helper is self-contained: it sets something up and returns a function to clean up
whatever resource it has spun up. The test controls when teardown happens by calling the
cleanup function at the appropriate time. Another benefit is that the returned teardown
closure has access to the local variables of the helper. So func() can access the helper’s
*testing.T without us having to pass it explicitly as a parameter.
Here’s how I’ve been using this pattern.
The setupTempFile helper creates a temporary file, writes some content to it, and returns
the file name along with a teardown function that removes the file.
func setupTempFile(t *testing.T, content string) (string, func()) {
t.Helper()
tmpFile, err := os.CreateTemp("", "temp-*.txt")
if err != nil {
t.Fatalf("failed to create temp file: %v", err)
}
if _, err := tmpFile.WriteString(content); err != nil {
t.Fatalf("failed to write to temp file: %v", err)
}
tmpFile.Close()
return tmpFile.Name(), func() {
if err := os.Remove(tmpFile.Name()); err != nil {
t.Errorf("failed to remove temp file %s: %v",
tmpFile.Name(), err)
} else {
t.Logf("cleaned up temp file: %s", tmpFile.Name())
}
}
}
In the main test:
func TestReadFile(t *testing.T) {
path, cleanup := setupTempFile(t, "hello world")
defer cleanup()
data, err := os.ReadFile(path)
if err != nil {
t.Fatalf("failed to read file: %v", err)
}
t.Logf("file contents: %s", data)
}
Running the test displays:
=== RUN TestReadFile
prog_test.go:18: file contents: hello world
prog_test.go:38: cleaned up temp file: /tmp/temp-30176446.txt
--- PASS: TestReadFile (0.00s)
PASS
Sometimes you want to test code that makes HTTP calls. Here’s a helper that starts an in-memory mock server and returns its URL and a cleanup function that shuts it down:
func setupMockServer(t *testing.T) (string, func()) {
t.Helper()
handler := http.HandlerFunc(
func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusOK)
w.Write([]byte("mock response"))
},
)
server := httptest.NewServer(handler)
return server.URL, func() {
server.Close()
t.Log("mock server shut down")
}
}
And in the test:
func TestHTTPRequest(t *testing.T) {
url, cleanup := setupMockServer(t)
defer cleanup()
resp, err := http.Get(url)
if err != nil {
t.Fatalf("failed to make HTTP request: %v", err)
}
defer resp.Body.Close()
body, _ := io.ReadAll(resp.Body)
t.Logf("response body: %s", body)
}
Running the test prints:
=== RUN TestHTTPRequest
prog_test.go:34: response body: mock response
prog_test.go:20: mock server shut down
--- PASS: TestHTTPRequest (0.00s)
PASS
In tests that hit a real (or test) database, you often need to create and drop tables. Here’s a helper that sets up a test table and returns a teardown function to drop it:
func setupTestTable(t *testing.T, db *sql.DB) func() {
t.Helper()
query := `CREATE TABLE IF NOT EXISTS users (
id INTEGER PRIMARY KEY,
name TEXT
)`
_, err := db.Exec(query)
if err != nil {
t.Fatalf("failed to create table: %v", err)
}
return func() {
_, err := db.Exec(`DROP TABLE IF EXISTS users`)
if err != nil {
t.Errorf("failed to drop table: %v", err)
} else {
t.Log("dropped test table")
}
}
}
And the test:
func TestInsertUser(t *testing.T) {
db := getTestDB(t) // Opens test DB; defined elsewhere
cleanup := setupTestTable(t, db)
defer cleanup()
_, err := db.Exec(`INSERT INTO users (name) VALUES (?)`, "Alice")
if err != nil {
t.Fatalf("failed to insert user: %v", err)
}
}
P.S. I learned about this after the blog went live.
Go 1.14 added the t.Cleanup() method, which lets you avoid returning the teardown closures
from helper functions altogether. It also runs the cleanup logic in the correct order
(LIFO). So, you could rewrite the first example in this post as follows:
func TestFoo(t *testing.T) {
// The testing package will ensure that the cleanup runs at the end of
// this test function.
helper(t)
// Test logic here.
}
func helper(t *testing.T) {
t.Helper()
// We register the teardown logic with t.Cleanup().
t.Cleanup(func() {
// Teardown logic here.
})
}
Now the testing package will handle calling the cleanup logic in the correct order. You
can add multiple teardown functions like this:
t.Cleanup(func() {})
t.Cleanup(func() {})
The functions will run in LIFO order. Similarly, the database setup example can be rewritten like this:
func setupTestTable(t *testing.T, db *sql.DB) {
t.Helper()
// Logic as before.
// Instead of returning the teardown function, we register
// it with t.Cleanup().
t.Cleanup(func() {
_, err := db.Exec(`DROP TABLE IF EXISTS users`)
if err != nil {
t.Errorf("failed to drop table: %v", err)
} else {
t.Log("dropped test table")
}
})
}
Then the helper function is used like this:
func TestInsertUser(t *testing.T) {
db := getTestDB(t) // Opens a test DB connection; defined elsewhere.
// This sets up the DB, and t.Cleanup will execute the teardown
// logic once this test function finishes.
setupTestTable(t, db)
// Rest of the test logic.
}
Fin!
Go has always let you implement sort.Interface to sort the elements in a slice. Later, Go
1.8 introduced sort.Slice to reduce boilerplate with inline comparison functions. Most
recently, Go 1.21 brought generic sorting via the slices package, which offers a concise
syntax and compile-time type safety.
These days, I mostly use the generic sorting syntax, but I wanted to document all three approaches for posterity.
The oldest technique is based on sort.Interface. You create a custom type that wraps your
slice and implement three methods - Len, Less, and Swap - to satisfy the interface.
Then you pass this custom type to sort.Sort().
The following example defines an IntSlice type. Passing an IntSlice to sort.Sort
arranges its integers in ascending order:
import (
"fmt"
"sort"
)
// Define a custom IntSlice so that we can implement the sort.Interface
type IntSlice []int
// Len, Less, Swap are required to conform to sort.Interface
func (s IntSlice) Len() int { return len(s) }
func (s IntSlice) Less(i, j int) bool { return s[i] < s[j] }
func (s IntSlice) Swap(i, j int) { s[i], s[j] = s[j], s[i] }
func main() {
nums := IntSlice{4, 1, 3, 2}
sort.Sort(nums)
fmt.Println(nums) // [1 2 3 4]
}
To reverse the order, invert the comparison in the Less method and define a new type:
import (
"fmt"
"sort"
)
// Define a custom IntSlice for descending order sorting.
type DescIntSlice []int
func (s DescIntSlice) Len() int { return len(s) }
// Inverted comparison for descending order
func (s DescIntSlice) Less(i, j int) bool { return s[i] > s[j] }
func (s DescIntSlice) Swap(i, j int) { s[i], s[j] = s[j], s[i] }
func main() {
nums := DescIntSlice{4, 1, 3, 2}
sort.Sort(nums)
fmt.Println(nums) // [4 3 2 1]
}
Just reversing the order requires you to define a separate type and implement the three methods again!
Luckily, for the basic types, the sort package provides sort.IntSlice,
sort.Float64Slice, and sort.StringSlice - which already implement sort.Interface. So
you don’t have to do the above for sorting a slice of primitive elements. Instead, you can
do this:
ints := sort.IntSlice{4, 1, 3, 2}
floats := sort.Float64Slice{3.1, 2.7, 5.0}
strings := sort.StringSlice{"banana", "apple", "cherry"}
sort.Sort(ints) // ints: [1 2 3 4]
sort.Sort(floats) // floats: [2.7 3.1 5]
sort.Sort(strings) // strings: [apple banana cherry]
To reverse the order, you can use sort.Reverse as follows:
sort.Sort(sort.Reverse(ints)) // ints: [4 3 2 1]
sort.Sort(sort.Reverse(floats)) // floats: [5 3.1 2.7]
sort.Sort(sort.Reverse(strings)) // strings: [cherry banana apple]
However, if you’re dealing with a slice of structs, then you do have to implement
sort.Interface manually. Here, we sort by the Age field in ascending order:
import (
"fmt"
"sort"
)
type User struct {
Name string
Age int
}
type ByAge []User
func (s ByAge) Len() int { return len(s) }
func (s ByAge) Less(i, j int) bool { return s[i].Age < s[j].Age }
func (s ByAge) Swap(i, j int) { s[i], s[j] = s[j], s[i] }
func main() {
users := ByAge{
{"Alice", 32},
{"Bob", 27},
{"Carol", 40},
}
sort.Sort(users)
fmt.Println(users) // [{Bob 27} {Alice 32} {Carol 40}]
}
We can leverage sort.Reverse to reverse the order:
sort.Sort(sort.Reverse(users)) // [{Carol 40} {Alice 32} {Bob 27}]
Although sort.Interface can handle just about any sorting logic, you must create a new
custom type (or significantly modify an existing one) each time you want to sort a different
slice or the same slice in a different way. It’s powerful but verbose, and can be cumbersome
to maintain if you have many different sorts in your code.
Go 1.8 introduced sort.Slice to minimize the amount of boilerplate needed for sorting.
Instead of creating a new type and implementing three methods, you provide an inline
comparison function that receives the two indices you’re comparing.
Here’s a simple example that sorts floats in ascending order:
import (
"fmt"
"sort"
)
func main() {
floats := []float64{2.5, 0.1, 3.9, 1.2}
sort.Slice(floats, func(i, j int) bool {
return floats[i] < floats[j]
})
fmt.Println(floats) // [0.1 1.2 2.5 3.9]
}
Inverting the comparison sorts them in descending order:
import (
"fmt"
"sort"
)
func main() {
floats := []float64{2.5, 0.1, 3.9, 1.2}
sort.Slice(floats, func(i, j int) bool {
return floats[i] > floats[j] // Reverse the comp
})
fmt.Println(floats) // [3.9 2.5 1.2 0.1]
}
For structs, the inline comparator can access struct fields:
import (
"fmt"
"sort"
)
type User struct {
Name string
Age int
}
func main() {
users := []User{
{"Alice", 32},
{"Bob", 27},
{"Carol", 40},
}
sort.Slice(users, func(i, j int) bool {
return users[i].Age < users[j].Age
})
fmt.Println(users) // [{Bob 27} {Alice 32} {Carol 40}]
}
Switching > for < will reverse the sort:
import (
"fmt"
"sort"
)
type User struct {
Name string
Age int
}
func main() {
users := []User{
{"Alice", 32},
{"Bob", 27},
{"Carol", 40},
}
sort.Slice(users, func(i, j int) bool {
return users[i].Age > users[j].Age
})
fmt.Println(users) // [{Carol 40} {Alice 32} {Bob 27}]
}
While sort.Slice is much simpler than sort.Interface, it’s still not strictly type-safe:
the slice parameter is defined as an interface{}, and you provide a comparator that uses
indices. Go won’t necessarily stop you from doing something incorrect in the comparison at
compile time.
For example, this code compiles but will panic at runtime because other is referenced
inside the comparator of a different slice ints, and the indices i or j can go out of
bounds in other:
import (
"fmt"
"sort"
)
func main() {
ints := []int{3, 1, 2}
other := []int{10, 20}
sort.Slice(ints, func(i, j int) bool {
// Using 'other' here compiles, but i or j might be out of range.
return other[i] < other[j]
})
fmt.Println(ints)
}
You won’t find out you’ve made a mistake until runtime, when a panic occurs. There is no
compiler-enforced guarantee that the func(i, j int) bool actually compares two values of
the intended slice.
Note: In sort.Slice, the comparison function parameters i and j are indices.
Inside the function, you must reference slice[i] and slice[j] to get the actual elements
being compared.
Go 1.21 introduced the slices package, which provides generic sorting functions. These new
functions combine the convenience of sort.Slice with the ability to detect type errors at
compile time. For basic numeric or string slices that satisfy Go’s “ordered” constraints,
you can just call slices.Sort. For more complex or custom sorting, slices.SortFunc
accepts a comparator function that returns an integer (negative if a < b, zero if they’re
equal, and positive if a > b).
When you’re dealing with basic types like int, float64, or string, you can sort them
immediately using slices.Sort, which arranges them in ascending order:
import (
"fmt"
"slices"
)
func main() {
ints := []int{4, 1, 3, 2}
floats := []float64{2.5, 0.1, 3.9, 1.2}
slices.Sort(ints)
slices.Sort(floats)
fmt.Println(ints) // [1 2 3 4]
fmt.Println(floats) // [0.1 1.2 2.5 3.9]
}
For descending order, you can use slices.SortFunc and invert the usual comparison:
import (
"fmt"
"slices"
)
func main() {
ints := []int{4, 1, 3, 2}
floats := []float64{2.5, 0.1, 3.9, 1.2}
slices.SortFunc(ints, func(a, b int) int {
switch {
case a > b:
return -1
case a < b:
return 1
default:
return 0
}
})
slices.SortFunc(floats, func(a, b float64) int {
switch {
case a > b:
return -1
case a < b:
return 1
default:
return 0
}
})
fmt.Println(ints) // [4 3 2 1]
fmt.Println(floats) // [3.9 2.5 1.2 0.1]
}
When dealing with more complex structures, you can define precisely how two elements should be compared:
import (
"fmt"
"slices"
)
type User struct {
Name string
Age int
}
func main() {
users := []User{
{"Alice", 32},
{"Bob", 27},
{"Carol", 40},
}
slices.SortFunc(users, func(a, b User) int {
return a.Age - b.Age
})
fmt.Println(users) // [{Bob 27} {Alice 32} {Carol 40}]
}
To reverse the order, invert the numerical comparison:
import (
"fmt"
"slices"
)
type User struct {
Name string
Age int
}
func main() {
users := []User{
{"Alice", 32},
{"Bob", 27},
{"Carol", 40},
}
slices.SortFunc(users, func(a, b User) int {
switch {
case a.Age > b.Age:
return -1
case a.Age < b.Age:
return 1
default:
return 0
}
})
fmt.Println(users) // [{Carol 40} {Alice 32} {Bob 27}]
}
Note: Unlike sort.Slice, which passes indices to the comparison function,
slices.SortFunc passes the actual elements (a and b) to your comparator. Moreover,
the comparator must return an int (negative, zero, or positive), rather than a boolean.
One of the major benefits of the slices package is compile-time type safety, which you
don’t get with sort.Sort or sort.Slice. Those older APIs use interface{} parameters or
index-based comparators and don’t strictly verify that your comparator operates on the right
types.
As shown previously, you can accidentally reference a different slice in the comparator and
your code will compile but crash at runtime. By contrast, slices.Sort and
slices.SortFunc are fully generic. The compiler enforces that you pass a slice of a valid
type (e.g., []int, []string, or a custom struct slice), and that your comparator’s
signature matches the element type. This means you get errors at compile time instead of at
runtime.
For instance, if you attempt to pass an array instead of a slice:
import "slices"
func main() {
arr := [4]int{10, 20, 30, 40}
// compile-time error: cannot use arr (type [4]int) as []int
slices.Sort(arr)
}
Go will refuse to compile this code because arr is not a slice. Similarly, if your
comparator for slices.SortFunc returns a type other than int, the compiler will produce
an error. This helps you detect mistakes immediately, rather than discovering them at
runtime.
For a practical illustration, consider sorting a slice by a case-insensitive string field:
import (
"fmt"
"slices"
"strings"
)
type Animal struct {
Name string
Species string
}
func main() {
animals := []Animal{
{"Bob", "Giraffe"},
{"alice", "Zebra"},
{"Dave", "Elephant"},
}
// Sort by Name, ignoring case
slices.SortFunc(animals, func(a, b Animal) int {
aLower := strings.ToLower(a.Name)
bLower := strings.ToLower(b.Name)
switch {
case aLower < bLower:
return -1
case aLower > bLower:
return 1
default:
return 0
}
})
fmt.Println(animals)
// Output: [{alice Zebra} {Bob Giraffe} {Dave Elephant}]
}
Because your comparator expects an Animal for both a and b, you can’t accidentally
compare two different types or reference the wrong fields without hitting a compile-time
error.
Nil checks on interfaces in Go can surprise you: you expect the comparison to evaluate to true but get false instead.
Many moons ago, Russ Cox wrote a fantastic post on Go interface internals that clarified my confusion. This post is a distillation of my exploration of interfaces and nil comparisons.
Roughly speaking, an interface in Go has three components: a static type (what the
compiler sees), a dynamic type, and a dynamic value (what the interface holds at runtime).
For example:
var n any // The static type of n is any (interface{})
n = 1 // Upon assignment, the dynamic type becomes int
// And the dynamic value becomes 1
Here, the static type of n is any, which tells the compiler what operations are allowed
on the variable. In the case of any, any operation is allowed. When we assign 1 to n,
it adopts the dynamic type int and the dynamic value 1.
Internally, every interface value is implemented as a two-word structure: a pointer to
type information for the dynamic type, and a data word.
This data word might directly contain the value if it’s small enough, or it might hold a
pointer to the actual data. Note that this internal representation is distinct from the
interface’s declared or “static” type - the type you wrote in the code (any in the example
above). At runtime, what gets stored is only the pair of dynamic type and dynamic value.
Here’s a crude diagram:
+-----------------------+
| Interface |
+-----------------------+
| Pointer to type info | ---> [Dynamic type descriptor]
+-----------------------+
| Data | ---> [Dynamic value or pointer to the value]
+-----------------------+
Nil comparisons can be tricky because an interface value is considered nil only when both its dynamic type and dynamic value are nil. A few examples.
var p *int // p is a nil pointer of type *int
if p == nil {
fmt.Println("p is nil")
}
// Output: p is nil
Here, p is a pointer to an int and is explicitly nil, so the comparison works as expected.
This doesn’t have anything to do with explicit interfaces, but it’s important to demo basic
nil comparison to understand how comparisons work with interfaces.
var r io.Reader // The static type of r is io.Reader
r = nil // The dynamic type is nil
// The dynamic value is nil
// Since both the dynamic type and value evaluate to nil, r == nil is true
if r == nil {
fmt.Println("r is nil")
}
// Output: r is nil
In this case, r is directly set to nil. Since both the dynamic type and the dynamic value
are nil, the interface compares equal to nil.
var b *bytes.Buffer // b is a nil pointer of type *bytes.Buffer
var r io.Reader = b // The static type of r is io.Reader.
// The dynamic type of r is *bytes.Buffer.
// The dynamic value of r is nil.
// Although b is nil, r != nil because r holds type info (*bytes.Buffer).
if r == nil {
fmt.Println("r is nil")
} else {
fmt.Println("r is not nil")
}
// Output: r is not nil
Even though b is nil, assigning it to the interface variable r gives r a non-nil
dynamic type (*bytes.Buffer) with a nil dynamic value. Since r still holds type
information, r == nil returns false, even though the underlying value is nil.
When comparing an interface variable, Go checks both the dynamic type and the value. The variable evaluates to nil only if both are nil.
In cases where an interface variable might hold a nil pointer, we’ve seen that comparing the interface directly to nil may not yield the expected result.
A type assertion can help extract the underlying value so that you can perform a more reliable nil check. This approach is especially useful when you know the expected underlying type.
Below, we define a simple type myReader that implements the Read method to satisfy the
io.Reader interface.
type myReader struct{}

func (mr *myReader) Read(p []byte) (int, error) {
	return 0, nil
}
Now, consider the following example:
var mr *myReader     // mr is a nil pointer of type *myReader
var r io.Reader = mr // The static type of r is io.Reader
                     // The dynamic type of r is *myReader
                     // The dynamic value of r is nil
// Use a type assertion to extract the underlying *myReader value.
if underlying, ok := r.(*myReader); ok && underlying == nil {
	fmt.Println("r holds a nil pointer")
} else {
	fmt.Println("r does not hold a nil pointer")
}
// Output: r holds a nil pointer
Here, we assert that r holds a value of type *myReader. If the assertion succeeds
(indicated by ok being true) and the underlying value is nil, we can conclude that
the interface variable holds a nil pointer - even though the interface itself is not nil due
to its dynamic type.
This type assertion trick only works when you know the underlying type of the interface value. If the type might vary, consider using the reflect package to examine the underlying value.
The following function introspects any variable and checks whether it’s nil:
func isNil(i any) bool {
	if i == nil {
		return true
	}
	// Arrays are not nilable, so we skip reflect.Array.
	switch reflect.TypeOf(i).Kind() {
	case reflect.Ptr,
		reflect.Map,
		reflect.Chan,
		reflect.Slice,
		reflect.Func:
		return reflect.ValueOf(i).IsNil()
	}
	return false
}
The switch on .Kind() is necessary because calling reflect.ValueOf().IsNil() on a value
whose kind isn’t nilable (an int or a struct, for example) panics.
Calling this function on any value, including an interface, reliably checks whether it’s nil.
Fin!
The canonical way to modify HTTP requests and responses in Go is the middleware pattern:
you wrap handlers in functions, for example a middleware that intercepts requests to
/special to serve a custom response.
However, I often find the indirections introduced by this pattern a bit hard to read and debug. I recently came across the embedded delegation pattern while browsing Gin’s HTTP router source code. Here, I explore both patterns and explain why I usually start with delegation whenever I need to modify HTTP requests in my Go services.
Here’s an example where the logging middleware records each request, and the special
middleware intercepts requests to /special:
package main

import (
	"log"
	"net/http"
)

// loggingMiddleware logs incoming requests.
func loggingMiddleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		log.Println("Middleware: received request for", r.URL.Path)
		next.ServeHTTP(w, r)
	})
}

// specialMiddleware intercepts requests for "/special" and handles them.
func specialMiddleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == "/special" {
			w.Write([]byte("Special middleware handling request"))
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("Hello, world!"))
	})

	// Chain: logging wraps special handling, so logging runs first.
	handler := loggingMiddleware(specialMiddleware(mux))
	http.ListenAndServe(":8080", handler)
}
In this setup, every incoming request first passes through the logging middleware, which
logs the request details, and then through the special middleware, which checks for the
/special route. We’re effectively stacking the middleware functions.
If you hit the server with:
curl localhost:8080/
curl localhost:8080/special
the server logs will look like this:
2025/03/06 21:24:44 Middleware: received request for /
2025/03/06 21:24:47 Middleware: received request for /special
Stacking middleware functions like middleware3(middleware2(middleware1(mux))) can get
messy when you have many of them. That’s why people usually write a wrapper function to
apply the middlewares to the mux:
func applyMiddleware(
	handler http.Handler,
	middlewares ...func(http.Handler) http.Handler) http.Handler {
	// Apply middlewares in reverse order so the first one
	// listed becomes the outermost wrapper.
	for i := len(middlewares) - 1; i >= 0; i-- {
		handler = middlewares[i](handler)
	}
	return handler
}
applyMiddleware takes an http.Handler and a variadic list of middleware functions
(...func(http.Handler) http.Handler). It loops over the middleware in reverse order so
each one wraps the next properly. This avoids deep nesting like
middleware3(middleware2(middleware1(mux))) and keeps the middleware chain tidy.
You’d then use it like this:
func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("Hello, world!"))
	})

	// specialMiddleware wraps the mux first, so loggingMiddleware
	// ends up outermost and runs first.
	handler := applyMiddleware(mux, loggingMiddleware, specialMiddleware)
	http.ListenAndServe(":8080", handler)
}
This behaves just like the manual middleware stacking, but it’s a bit cleaner.
While this is the canonical way to handle request-response modifications in Go, it can sometimes be hard to reason about, especially when debugging or dealing with many middleware layers.
There’s another way to achieve the same result without dealing with a soup of nested functions. The next section talks about that.
Embedded delegation (or the delegation pattern) means you embed the standard HTTP
multiplexer inside your own struct and override its ServeHTTP method.
It’s a bit like inheritance - overriding a method in a subclass to add extra functionality and then delegating the call to the original method. Although Go doesn’t have a class hierarchy, you can still delegate responsibilities to the embedded type’s method.
The following example implements the same behavior - logging every request and intercepting
the /special route - directly within a custom mux:
package main

import (
	"log"
	"net/http"
)

// CustomMux embeds http.ServeMux to override ServeHTTP.
type CustomMux struct {
	*http.ServeMux
}

// ServeHTTP logs the request and intercepts "/special" before
// delegating to the embedded mux.
func (cm *CustomMux) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	// Log all requests.
	log.Println("CustomMux: received request for", r.URL.Path)

	// Handle "/special" differently.
	if r.URL.Path == "/special" {
		w.Write([]byte("Special handling in CustomMux"))
		return
	}
	cm.ServeMux.ServeHTTP(w, r)
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("Hello, world!"))
	})

	// Wrap the standard mux with our custom delegation.
	customMux := &CustomMux{ServeMux: mux}
	http.ListenAndServe(":8080", customMux)
}
In this example, the custom mux centralizes both logging and special-case route handling
within one ServeHTTP method. This approach cuts out the extra function calls in a
middleware chain and can simplify tracking the request flow. I find it a bit easier on the
eyes too.
If you have a bunch of extra functionality to add inside cm.ServeHTTP, you can wrap them
in utility functions like this:
// logRequest logs incoming HTTP requests.
func logRequest(r *http.Request) {
	log.Println("CustomMux: received request for", r.URL.Path)
}

// handleSpecialRequest handles requests to "/special"
// and returns true if handled.
func handleSpecialRequest(w http.ResponseWriter, r *http.Request) bool {
	if r.URL.Path != "/special" {
		return false // Not handled, continue processing.
	}
	w.Write([]byte("Special handling in CustomMux"))
	return true // Handled; no further processing needed.
}
Then, simply call these functions inside your cm.ServeHTTP method:
func (cm *CustomMux) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	logRequest(r)
	if handleSpecialRequest(w, r) {
		return
	}
	cm.ServeMux.ServeHTTP(w, r)
}
This keeps all the request modifications in a single ServeHTTP method.
You can also mix both techniques. For example, you might use direct delegation for special route handling and then wrap the resulting handler with middleware for logging. Here’s how a hybrid solution might look:
package main

import (
	"log"
	"net/http"
)

// CustomMux embeds http.ServeMux and intercepts "/special".
type CustomMux struct {
	*http.ServeMux
}

// ServeHTTP intercepts "/special" and delegates other routes.
func (cm *CustomMux) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	if r.URL.Path == "/special" {
		w.Write([]byte("Special handling in CustomMux"))
		return
	}
	cm.ServeMux.ServeHTTP(w, r)
}

// loggingMiddleware logs incoming requests.
func loggingMiddleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		log.Println("Middleware: received request for", r.URL.Path)
		next.ServeHTTP(w, r)
	})
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("Hello, world!"))
	})

	// Use direct delegation for special routing.
	customMux := &CustomMux{ServeMux: mux}

	// Wrap the custom mux with logging middleware.
	handler := loggingMiddleware(customMux)
	http.ListenAndServe(":8080", handler)
}
In this hybrid approach, the specialized behavior (intercepting the /special path) is
handled via direct delegation, while logging stays modular as middleware. This gives you the
best of both worlds.
I usually start with the embedded delegation and gradually introduce the middleware pattern if I need it later. It’s easier to adopt the middleware pattern if you start with delegation than the other way around.
When I first started writing Go, I found the design of io.Reader a bit odd:
type Reader interface {
	Read(p []byte) (n int, err error)
}
Why take a byte slice and write data into it? Wouldn’t it be simpler to create the slice
inside Read, load the data, and return it instead?
// Hypothetical; what I *thought* it should be
Read() (p []byte, err error)
This felt more intuitive to me - you call Read, and it gives you a slice filled with data,
no need to pass anything.
I found out why it’s designed this way while watching this excellent GopherCon Singapore talk on understanding allocations by Jacob Walker. It mainly boils down to two reasons.
If Read created and returned a new slice every time, the memory would always end up on the
heap.
Heap allocations are slower because they require garbage collection, while stack allocations
are faster since they are freed automatically when a function returns. By taking a
caller-provided slice, Read lets the caller control memory and reuse buffers, keeping them
on the stack whenever possible.
This matters a lot when reading large amounts of data. If each Read call created a new
slice, you’d constantly be allocating memory, leading to more work for the garbage
collector. Instead, the caller can allocate a buffer once and reuse it across multiple
reads:
buf := make([]byte, 4096) // Single allocation
n, err := reader.Read(buf) // Read into existing buffer
Go’s escape analysis tool (go build -gcflags=-m) can confirm this. If Read returned a
new slice, the tool would likely show:
buf escapes to heap
meaning Go has to allocate it dynamically. By reusing a preallocated slice, we avoid
unnecessary heap allocations - but only if the buffer is small enough to fit on the stack.
How small? Only the compiler knows, and you shouldn’t depend on it; use the escape
analysis tool to check. Most of the time, though, you don’t need to worry about this at all.
The second issue is correctness. When reading from a stream, you usually call Read
multiple times to get all the data. If Read returned a fresh slice every time, you’d have
no control over memory usage across calls. Worse, you couldn’t efficiently handle partial
reads, making buffer management unpredictable.
With the hypothetical version of Read, every call would allocate a new slice. If you
needed to read a large stream of data, you’d have to manually piece everything together
using append, like this:
var allData []byte
for {
	buf, err := reader.Read() // New allocation every call
	if err != nil {
		break
	}
	allData = append(allData, buf...) // Growing slice, more allocs
}
process(allData)
This is a mess. Every time append runs out of space, Go will have to allocate a larger
slice and copy the existing data over, piling on unnecessary GC pressure.
By contrast, io.Reader’s actual design avoids this problem:
buf := make([]byte, 4096) // Allocate once
for {
	n, err := reader.Read(buf)
	if err != nil {
		break
	}
	process(buf[:n])
}
This avoids unnecessary allocations and produces less garbage for the GC to clean up.