feat: add cachecompress package to compress static files for HTTP #21915

Merged
spikecurtis merged 1 commit into main from spike/internal-1300-cachecompress-pkg on Feb 6, 2026

feat: add cachecompress package to compress static files for HTTP#21915
spikecurtis merged 1 commit intomainfrom
spike/internal-1300-cachecompress-pkg

Conversation


@spikecurtis (Contributor) commented Feb 4, 2026

relates to: coder/internal#1300

Adds a new package called cachecompress which takes a http.FileSystem and wraps it with an on-disk cache of compressed files. We lazily compress files when they are requested over HTTP.

Why we want this

With cached compression, we significantly reduce CPU utilization during workspace creation.

[screenshot: CPU utilization]

This is from a 2k scaletest at the top of this stack of PRs, so that it's used to serve /bin/ files. Previously we pegged the 4-core Coderds, with profiling showing 40% of CPU going to zstd compression (cf. coder/internal#1300).

With this change, compression is reduced to about 1 second of CPU time (from 7 minutes).

Implementation details

The basic structure is taken from Chi's Compressor middleware. I've reproduced the LICENSE in the directory because it's MIT licensed, not AGPL like the rest of Coder.

I've structured it not as a middleware that calls an arbitrary upstream HTTP handler, but as a wrapper that takes an explicit http.FileSystem. This is done for safety, so that we only cache static files and not dynamically generated content.

One limitation is that on first request for a resource, it compresses the whole file before starting to return any data to the client. For large files like the Coder binaries, this can add 1-5 seconds to the time-to-first-byte, depending on the compression used.

I think this is reasonable: it only affects the very first download of the binary with a particular compression encoding on a particular Coderd.

If we later find this unacceptable, we can fix it without changing interfaces: we can poll the file system to figure out how much data is available while the compression is in progress.

@coder-tasks (bot) left a comment


Code Review

Reviewed the new cachecompress package that implements on-disk caching of compressed static files. This is a substantial performance improvement for workspace creation. Found several issues that should be addressed:

Critical Issues

  1. Binary profiling files committed - Five profiling files (coderd-fgprof.prof, mem.prof, test-build-info.prof, test-cpu.pprof, test-insert-file.prof) were accidentally included in the PR. These should be removed and added to .gitignore.
  2. Deprecated import - Using deprecated io/ioutil package in tests
  3. Missing context propagation - Context from request not passed to compression goroutines
  4. Unsafe concurrent map access - Potential race condition in cache lookup

Important Issues

  1. File handle management - Incorrect defer ordering could leak file handles or corrupt cached files
  2. Error handling - Missing error handling for encoder.Close()
  3. Cache cleanup - No mechanism to clean up old/unused cache files (not blocking, but should be tracked)

Positive Observations

  • Well-structured package with clear separation of concerns
  • Good use of sync.Pool for encoder reuse
  • Proper MIT license attribution from Chi
  • Comprehensive test coverage for encoding selection
  • Smart fallback to uncompressed on errors

The implementation is solid overall, but the critical issues (especially the binary files and race condition) should be fixed before merge.

spikecurtis force-pushed the spike/internal-1300-cachecompress-pkg branch from 8eea2e0 to 05358a4 on February 4, 2026 10:19
spikecurtis force-pushed the spike/internal-1300-cachecompress-pkg branch from 05358a4 to 3a2606c on February 4, 2026 11:50
spikecurtis force-pushed the spike/internal-1300-cachecompress-pkg branch from 3a2606c to de03035 on February 4, 2026 11:50
Comment on lines +278 to +279
c.cache[ck] = cref
go c.compress(context.Background(), encoding, cref, r)
Contributor


I get that insertion prior to the compression being completed, then using the two channels to wait on the compression completion/erroring, is a mechanism for eliminating duplicate compression of the file, but to me that mechanism itself was a bit strange to read/follow.

My question would be whether we're serving 'new' files often enough that some duplicated compression calls are that expensive, but if we assume it is then I think we can make use of the singleflight package as an easier-to-read mechanism for only allowing a single goroutine to do compression when there is no cache entry, something like this.

Contributor Author


My question would be whether we're serving 'new' files often enough that some duplicated compression calls are that expensive,

I mean, avoiding duplicate compression is the whole reason to do this.

I think we can make use of the singleflight package as an easier to read mechanism for only allowing a single go routine to do compression when there is no cache entry, something like this.

I have to say, I don't really find your singleflight based implementation to be an improvement. This could be just due to having spent a lot of time thinking about this implementation, but I'll try to offer what I think are the issues.

singleflight is designed to suppress concurrent duplication of function calls. But we want to suppress duplication even if the calls are not concurrent. You work around this by keeping the cache map around, with a mutex, and then have to lock and unlock it a few times before and during the singleflight. So we've got two different lock domains, the map and the singleflight group, but they actually represent the same resource. Arguably, singleflight is familiar and may make it easier to see what we are trying to do, but having the interacting map/mutex and singleflight group makes it much more difficult to determine whether it is correct in all cases.

In contrast, the concurrency logic in my implementation is much simpler: there is a single lock over a single keyspace. If the key doesn't exist when you look for it, then you insert it and are definitely the one that needs to kick off the compression. If the key does exist, you are definitely not the one that kicks off the compression. In either case, you wait until it is done.

Contributor


I see what you mean re: single lock vs multiple locking. I agree with you, your implementation is simpler 👍

Contributor


I mean, avoiding duplicate compression is the whole reason to do this.

I meant the concurrent compression calls, i.e. a thundering herd trying to create the cache entry, not questioning having the cache altogether 👍

Contributor Author


I meant the concurrent compression calls, i.e. a thundering herd trying to create the cache entry, not questioning having the cache altogether 👍

Oh, that makes sense. I'm speculating, but I think we'd be alright from a CPU perspective, since it only takes a couple of seconds to compress our binaries. The big issue, though, is how you could allow multiple writers to safely write to the same file. There are probably ways, but it's much simpler to only allow a single writer.

spikecurtis merged commit 8aa9e9a into main Feb 6, 2026
24 of 26 checks passed
spikecurtis deleted the spike/internal-1300-cachecompress-pkg branch February 6, 2026 06:13
github-actions bot locked and limited conversation to collaborators Feb 6, 2026