Lock-free, memory-friendly bounded buffer pools for Go, optimized for low-latency systems.
iobuf utilizes the Spin and Adapt layers of our performance ecosystem:
- Strike: System call → Direct kernel hit.
- Spin: Hardware yield (`spin`) → Local atomic synchronization.
- Adapt: Software backoff (`iox.Backoff`) → External I/O readiness.
- Bounded lock-free buffer pools for low-latency systems.
- Page-aligned memory allocation for DMA and io_uring compatibility.
- Zero-copy IoVec generation for vectored I/O syscalls.
- Cooperative back-off: Uses `iox.Backoff` to handle resource exhaustion gracefully.
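The zero-copy IoVec feature maps buffers onto the kernel's `iovec` layout for vectored I/O. As a stdlib-only sketch of the same gather-write pattern (the `iovecFrom` helper here is illustrative, not the iobuf API, and `SYS_WRITEV` is Linux-specific):

```go
package main

import (
	"fmt"
	"os"
	"syscall"
	"unsafe"
)

// iovecFrom builds a []syscall.Iovec that points into existing byte
// slices without copying, the same shape readv/writev consume.
// (Illustrative helper; not part of the iobuf API.)
func iovecFrom(bufs [][]byte) []syscall.Iovec {
	iov := make([]syscall.Iovec, len(bufs))
	for i, b := range bufs {
		iov[i].Base = &b[0]
		iov[i].SetLen(len(b))
	}
	return iov
}

func main() {
	r, w, err := os.Pipe()
	if err != nil {
		panic(err)
	}
	defer r.Close()

	iov := iovecFrom([][]byte{[]byte("hello, "), []byte("world\n")})
	// One writev syscall gathers both buffers into the pipe (Linux).
	n, _, errno := syscall.Syscall(syscall.SYS_WRITEV, w.Fd(),
		uintptr(unsafe.Pointer(&iov[0])), uintptr(len(iov)))
	w.Close()
	if errno != 0 {
		panic(errno)
	}

	out := make([]byte, n)
	r.Read(out)
	fmt.Print(string(out)) // hello, world
}
```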
- Go 1.26+
- 64-bit CPU (amd64, arm64, riscv64, loong64, ppc64, s390x, mips64, etc.)
Note: 32-bit architectures are not supported due to 64-bit atomic operations in the lock-free pool implementation.
```sh
go get code.hybscloud.com/iobuf
```

```go
// Create a pool of 1024 small buffers (2 KiB each)
pool := iobuf.NewSmallBufferPool(1024)
pool.Fill(iobuf.NewSmallBuffer)

// Acquire a buffer index
idx, err := pool.Get()
if err != nil {
	panic(err)
}

// Access the buffer directly (zero-copy)
buf := pool.Value(idx)
// ...

// Return to pool
pool.Put(idx)
```

```go
// Single page-aligned block (default page size)
block := iobuf.AlignedMemBlock()

// Custom size with explicit alignment
mem := iobuf.AlignedMem(65536, iobuf.PageSize)

// Multiple aligned blocks
blocks := iobuf.AlignedMemBlocks(16, iobuf.PageSize)
```

```go
// Convert tiered buffers to iovec for readv/writev
buffers := make([]iobuf.SmallBuffer, 8)
iovecs := iobuf.IoVecFrom(buffers)

// Get raw pointer and count for syscalls
addr, n := iobuf.IoVecAddrLen(iovecs)
```

Power-of-4 progression starting at 32 bytes (12 tiers, 32 B to 128 MiB):
| Tier | Size | Use Case |
|---|---|---|
| Pico | 32 B | UUIDs, flags, tiny control messages |
| Nano | 128 B | HTTP headers, JSON tokens, small RPC payloads |
| Micro | 512 B | DNS packets, MQTT messages, protocol frames |
| Small | 2 KiB | WebSocket frames, small HTTP responses |
| Medium | 8 KiB | TCP segments, gRPC messages, page I/O |
| Big | 32 KiB | TLS records (16 KiB max), stream chunks |
| Large | 128 KiB | io_uring buffer rings, bulk network transfers |
| Great | 512 KiB | Database pages, large API responses |
| Huge | 2 MiB | Huge page aligned, memory-mapped files |
| Vast | 8 MiB | Image processing, compressed archives |
| Giant | 32 MiB | Video frames, ML model weights |
| Titan | 128 MiB | Large datasets, maximum stack-safe buffer |
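The progression in the table can be generated directly: tier i holds buffers of 32 · 4^i bytes. A quick check, using the tier names from the table above:

```go
package main

import "fmt"

// tierSize returns the buffer size in bytes of tier i (0 = Pico)
// in the power-of-4 progression: 32 * 4^i.
func tierSize(i int) int {
	return 32 << (2 * uint(i))
}

func main() {
	names := []string{"Pico", "Nano", "Micro", "Small", "Medium", "Big",
		"Large", "Great", "Huge", "Vast", "Giant", "Titan"}
	for i, name := range names {
		fmt.Printf("%-6s %d B\n", name, tierSize(i))
	}
	// Titan: 32 << 22 = 134217728 B = 128 MiB
}
```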
```go
// Generic pool interface
type Pool[T any] interface {
	Put(item T) error
	Get() (item T, err error)
}

// Index-based pool for zero-copy buffer access
type IndirectPool[T BufferType] interface {
	Pool[int]
	Value(indirect int) T
	SetValue(indirect int, item T)
}
```

```go
func NewPicoBufferPool(capacity int) *PicoBufferBoundedPool
func NewNanoBufferPool(capacity int) *NanoBufferBoundedPool
func NewMicroBufferPool(capacity int) *MicroBufferBoundedPool
func NewSmallBufferPool(capacity int) *SmallBufferBoundedPool
func NewMediumBufferPool(capacity int) *MediumBufferBoundedPool
func NewBigBufferPool(capacity int) *BigBufferBoundedPool
func NewLargeBufferPool(capacity int) *LargeBufferBoundedPool
func NewGreatBufferPool(capacity int) *GreatBufferBoundedPool
func NewHugeBufferPool(capacity int) *HugeBufferBoundedPool
func NewVastBufferPool(capacity int) *VastBufferBoundedPool
func NewGiantBufferPool(capacity int) *GiantBufferBoundedPool
func NewTitanBufferPool(capacity int) *TitanBufferBoundedPool
```

```go
// Page-aligned memory
func AlignedMem(size int, pageSize uintptr) []byte
func AlignedMemBlocks(n int, pageSize uintptr) [][]byte
func AlignedMemBlock() []byte

// Cache-line-aligned memory (prevents false sharing)
func CacheLineAlignedMem(size int) []byte
func CacheLineAlignedMemBlocks(n int, blockSize int) [][]byte
const CacheLineSize // 64 or 128 depending on architecture
```

```go
func IoVecFrom[T BufferType](buffers []T) []IoVec
func IoVecFromBytesSlice(iov [][]byte) (addr uintptr, n int)
func IoVecFromRegisteredBuffers(buffers []RegisterBuffer) []IoVec
func IoVecAddrLen(vec []IoVec) (addr uintptr, n int)
```

The bounded pool implementation is based on lock-free queue algorithms:
- Memory-efficient: O(n) space for n-capacity pool
- Lock-free progress: Guaranteed global progress bounds
- Cache-friendly: Minimizes false sharing and cache-line bouncing
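The references below refine bounded lock-free MPMC queues. As a self-contained illustration of the general class (a sketch, not iobuf's actual algorithm), here is a minimal Vyukov-style bounded ring in which per-slot sequence numbers replace locks:

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// cell pairs a value with a sequence number; the sequence coordinates
// producers and consumers without locks.
type cell struct {
	seq uint64
	val int
}

// BoundedQueue is a minimal Vyukov-style bounded MPMC ring.
// (Illustrative sketch; not iobuf's implementation.)
type BoundedQueue struct {
	mask uint64
	buf  []cell
	head uint64 // next dequeue position
	tail uint64 // next enqueue position
}

// NewBoundedQueue allocates a ring; capacity must be a power of two.
func NewBoundedQueue(capacity uint64) *BoundedQueue {
	q := &BoundedQueue{mask: capacity - 1, buf: make([]cell, capacity)}
	for i := range q.buf {
		q.buf[i].seq = uint64(i) // slot i is free for enqueue position i
	}
	return q
}

// Put enqueues v, returning false if the queue is full.
func (q *BoundedQueue) Put(v int) bool {
	for {
		pos := atomic.LoadUint64(&q.tail)
		c := &q.buf[pos&q.mask]
		seq := atomic.LoadUint64(&c.seq)
		switch {
		case seq == pos: // slot free: try to claim it
			if atomic.CompareAndSwapUint64(&q.tail, pos, pos+1) {
				c.val = v
				atomic.StoreUint64(&c.seq, pos+1) // publish the value
				return true
			}
		case seq < pos: // a full lap behind: queue is full
			return false
		} // seq > pos: another producer won the slot; retry
	}
}

// Get dequeues a value, returning false if the queue is empty.
func (q *BoundedQueue) Get() (int, bool) {
	for {
		pos := atomic.LoadUint64(&q.head)
		c := &q.buf[pos&q.mask]
		seq := atomic.LoadUint64(&c.seq)
		switch {
		case seq == pos+1: // slot filled: try to claim it
			if atomic.CompareAndSwapUint64(&q.head, pos, pos+1) {
				v := c.val
				atomic.StoreUint64(&c.seq, pos+q.mask+1) // free for next lap
				return v, true
			}
		case seq < pos+1: // nothing published yet: queue is empty
			return 0, false
		} // seq > pos+1: another consumer won the slot; retry
	}
}

func main() {
	q := NewBoundedQueue(8)
	var wg sync.WaitGroup
	var got uint64
	for p := 0; p < 2; p++ { // 2 producers x 100 items each
		wg.Add(1)
		go func(base int) {
			defer wg.Done()
			for i := 0; i < 100; i++ {
				for !q.Put(base + i) {
					runtime.Gosched() // full: yield and retry
				}
			}
		}(p * 1000)
	}
	for c := 0; c < 2; c++ { // 2 consumers drain all 200 items
		wg.Add(1)
		go func() {
			defer wg.Done()
			for atomic.LoadUint64(&got) < 200 {
				if _, ok := q.Get(); ok {
					atomic.AddUint64(&got, 1)
				} else {
					runtime.Gosched() // empty: yield and retry
				}
			}
		}()
	}
	wg.Wait()
	fmt.Println(got) // 200
}
```

Real implementations, including those in the papers cited below, add cache-line padding and stronger progress guarantees; this sketch only shows the per-slot sequencing idea.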
- Morrison & Afek, "Fast concurrent queues for x86 processors," PPoPP 2013
- Nikolaev, "A scalable, portable, and memory-efficient lock-free FIFO queue," DISC 2019
- Koval & Aksenov, "Restricted memory-friendly lock-free bounded queues," PPoPP 2020
- Nikolaev & Ravindran, "wCQ: A fast wait-free queue with bounded memory usage," 2022
- Aksenov et al., "Memory bounds for concurrent bounded queues," 2024
- Denis & Goedefroit, "NBLFQ: A lock-free MPMC queue optimized for low contention," IPDPS 2025
MIT License - see LICENSE for details.
© 2025 Hayabusa Cloud Co., Ltd.