feat: add cachecompress package to compress static files for HTTP #21915
spikecurtis merged 1 commit into main
Conversation
Code Review
Reviewed the new cachecompress package that implements on-disk caching of compressed static files. This is a substantial performance improvement for workspace creation. Found several issues that should be addressed:
Critical Issues
- Binary profiling files committed - Five profiling files (`coderd-fgprof.prof`, `mem.prof`, `test-build-info.prof`, `test-cpu.pprof`, `test-insert-file.prof`) were accidentally included in the PR. These should be removed and added to `.gitignore`.
- Deprecated import - Using the deprecated `io/ioutil` package in tests
- Missing context propagation - Context from the request is not passed to compression goroutines
- Unsafe concurrent map access - Potential race condition in cache lookup
Important Issues
- File handle management - Incorrect defer ordering could leak file handles or corrupt cached files
- Error handling - Missing error handling for `encoder.Close()`
- Cache cleanup - No mechanism to clean up old/unused cache files (not blocking, but should be tracked)
Positive Observations
- Well-structured package with clear separation of concerns
- Good use of sync.Pool for encoder reuse
- Proper MIT license attribution from Chi
- Comprehensive test coverage for encoding selection
- Smart fallback to uncompressed on errors
The implementation is solid overall, but the critical issues (especially the binary files and race condition) should be fixed before merge.
Force-pushed: 8eea2e0 → 05358a4 → 3a2606c → de03035
```go
c.cache[ck] = cref
go c.compress(context.Background(), encoding, cref, r)
```
I get that inserting the entry before compression completes, then using the two channels to wait on completion or error, is a mechanism for eliminating duplicate compression of the file, but to me that mechanism itself was a bit strange to read and follow.
My question would be whether we're serving 'new' files often enough that some duplicated compression calls are that expensive, but if we assume they are, then I think we can make use of the singleflight package as an easier-to-read mechanism for allowing only a single goroutine to do compression when there is no cache entry, something like this.
My question would be whether we're serving 'new' files often enough that some duplicated compression calls are that expensive,
I mean, avoiding duplicate compression is the whole reason to do this.
I think we can make use of the singleflight package as an easier-to-read mechanism for allowing only a single goroutine to do compression when there is no cache entry, something like this.
I have to say, I don't really find your singleflight based implementation to be an improvement. This could be just due to having spent a lot of time thinking about this implementation, but I'll try to offer what I think are the issues.
singleflight is designed to suppress concurrent duplication of function calls. But we want to suppress duplication even when the calls are not concurrent. You work around this by keeping the cache map around, with a mutex, and then have to lock and unlock it a few times before and during the singleflight. So we've got these two different lock domains, the map and the singleflight group, but they actually represent the same resource. Arguably, the singleflight is familiar and may make it easier to see what we are trying to do, but having the interacting map/mutex and singleflight group makes it much more difficult to determine whether it is correct in all cases.
In contrast, the concurrency logic in my implementation is much simpler: there is a single lock over a single keyspace. If the key doesn't exist when you look for it, then you insert it and are definitely the one that needs to kick off the compression. If the key does exist, you are definitely not the one that kicks off the compression. In either case, you wait until it is done.
I see what you mean re: single lock vs multiple locking. I agree with you, your implementation is simpler 👍
I mean, avoiding duplicate compression is the whole reason to do this.
I meant the concurrent compression calls, i.e. a thundering herd trying to create the cache entry, not questioning having the cache altogether 👍
I meant the concurrent compression calls, i.e. a thundering herd trying to create the cache entry, not questioning having the cache altogether 👍
Oh, that makes sense. I'm speculating, but I think we'd be alright from a CPU perspective, since it only takes a couple of seconds to compress our binaries. The bigger issue, though, is how you could allow multiple writers to safely write to the same file. There are probably ways, but it's much simpler to allow only a single writer.

relates to: coder/internal#1300
Adds a new package called `cachecompress` which takes an `http.FileSystem` and wraps it with an on-disk cache of compressed files. We lazily compress files when they are requested over HTTP.
Why we want this
With cached compress, we reduce CPU utilization during workspace creation significantly.
This is from a 2k scaletest at the top of this stack of PRs, where it's used to serve `/bin` files. Previously we pegged the 4-core Coderds, with profiling showing 40% of CPU going to `zstd` compression (c.f. coder/internal#1300). With this change, compression is reduced to 1s of CPU time (from 7 minutes).
Implementation details
The basic structure is taken from Chi's Compressor middleware. I've reproduced the `LICENSE` in the directory because it's MIT licensed, not AGPL like the rest of Coder.
I've structured it not as a middleware that calls an arbitrary upstream HTTP handler, but as taking an explicit `http.FileSystem`. This is done for safety, so that we only cache static files and never dynamically generated content.
One limitation is that on the first request for a resource, it compresses the whole file before starting to return any data to the client. For large files like the Coder binaries, this can add 1-5 seconds to the time-to-first-byte, depending on the compression used.
I think this is reasonable: it only affects the very first download of the binary with a particular compression for a particular Coderd.
If we later find this unacceptable, we can fix it without changing interfaces: we can poll the file system to figure out how much data is available while the compression is in progress.
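Since the cache keys files per compression, a wrapper like this also has to negotiate which encoding to use from the client's `Accept-Encoding` header, falling back to serving the file uncompressed when nothing matches. A minimal sketch of that selection step follows; the function name, the preference order, and the quality-value handling are assumptions for illustration, not the PR's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// preferred lists encodings in server preference order; zstd first here
// because the PR's profiling showed zstd dominating CPU before caching
// (the order is an assumption for this sketch).
var preferred = []string{"zstd", "br", "gzip"}

// selectEncoding picks the first preferred encoding the client accepts,
// or "" to serve the file uncompressed.
func selectEncoding(acceptEncoding string) string {
	accepted := map[string]bool{}
	for _, part := range strings.Split(acceptEncoding, ",") {
		// Strip any quality value, e.g. "gzip;q=0.8" -> "gzip".
		enc := strings.TrimSpace(strings.SplitN(part, ";", 2)[0])
		if enc != "" {
			accepted[enc] = true
		}
	}
	for _, enc := range preferred {
		if accepted[enc] {
			return enc
		}
	}
	return "" // no supported encoding: fall back to uncompressed
}

func main() {
	fmt.Println(selectEncoding("gzip, br;q=0.9")) // br
	fmt.Println(selectEncoding("identity"))       // "" (uncompressed)
}
```

The empty-string fallback mirrors the review's "smart fallback to uncompressed" observation: an unrecognized `Accept-Encoding` never blocks serving the file, it just skips the cache.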