Documentation ¶
Overview ¶
Package zerfoo provides the core building blocks for creating and training neural networks. It offers a prelude of commonly used types to simplify development and enhance readability of model construction code.
Index ¶
- func NewAdamW[T tensor.Numeric](learningRate, beta1, beta2, epsilon, weightDecay T) *optimizer.AdamW[T]
- func NewCPUEngine[T tensor.Numeric]() compute.Engine[T]
- func NewDefaultTrainer[T tensor.Numeric](g *graph.Graph[T], lossNode graph.Node[T], opt optimizer.Optimizer[T], ...) *training.DefaultTrainer[T]
- func NewFloat32Ops() numeric.Arithmetic[float32]
- func NewGraph[T tensor.Numeric](engine compute.Engine[T]) *graph.Builder[T]
- func NewMSE[T tensor.Numeric](engine compute.Engine[T]) *loss.MSE[T]
- func NewRMSNorm[T tensor.Numeric](name string, engine compute.Engine[T], ops numeric.Arithmetic[T], modelDim int, ...) (*normalization.RMSNorm[T], error)
- func NewTensor[T tensor.Numeric](shape []int, data []T) (*tensor.TensorNumeric[T], error)
- func RegisterLayer[T tensor.Numeric](opType string, builder model.LayerBuilder[T])
- func UnregisterLayer(opType string)
- type Batch
- type Embedding
- type Engine
- type GenerateOption
- func WithGenMaxTokens(n int) GenerateOption
- func WithGenTemperature(t float32) GenerateOption
- func WithGenTopP(p float32) GenerateOption
- func WithSchema(schema grammar.JSONSchema) GenerateOption
- func WithToolChoice(choice serve.ToolChoice) GenerateOption
- func WithTools(tools ...serve.Tool) GenerateOption
- type GenerateResult
- type Graph
- type LayerBuilder
- type Model
- func (m *Model) Chat(prompt string) (string, error)
- func (m *Model) ChatStream(ctx context.Context, prompt string, opts ...GenerateOption) (<-chan StreamToken, error)
- func (m *Model) Close() error
- func (m *Model) Embed(texts []string) ([]Embedding, error)
- func (m *Model) Generate(ctx context.Context, prompt string, opts ...GenerateOption) (*GenerateResult, error)
- type Node
- type Numeric
- type Parameter
- type StreamToken
- type Tensor
- type ToolCall
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func NewAdamW ¶
func NewAdamW[T tensor.Numeric](learningRate, beta1, beta2, epsilon, weightDecay T) *optimizer.AdamW[T]
NewAdamW creates a new AdamW optimizer with the given hyperparameters.
Stable.
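For orientation, the five hyperparameters map onto the standard AdamW update with decoupled weight decay. The following is a minimal, self-contained sketch of that textbook rule on plain float64 slices; it is not zerfoo's implementation, and the names `adamWState` and `adamWStep` are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// adamWState holds per-parameter moment estimates and the step counter.
// Hypothetical type for illustration; not part of the zerfoo API.
type adamWState struct {
	m, v []float64
	step int
}

// adamWStep applies one textbook AdamW update to params in place.
// It assumes params, grads, s.m and s.v all have the same length.
func adamWStep(params, grads []float64, s *adamWState, lr, beta1, beta2, eps, weightDecay float64) {
	s.step++
	// Bias-correction terms for the first and second moments.
	bc1 := 1 - math.Pow(beta1, float64(s.step))
	bc2 := 1 - math.Pow(beta2, float64(s.step))
	for i, g := range grads {
		s.m[i] = beta1*s.m[i] + (1-beta1)*g
		s.v[i] = beta2*s.v[i] + (1-beta2)*g*g
		mHat := s.m[i] / bc1
		vHat := s.v[i] / bc2
		// Decoupled weight decay: applied to the parameter itself,
		// not folded into the gradient.
		params[i] -= lr * (mHat/(math.Sqrt(vHat)+eps) + weightDecay*params[i])
	}
}

func main() {
	params := []float64{1.0, -2.0}
	grads := []float64{0.5, -0.5}
	state := &adamWState{m: make([]float64, 2), v: make([]float64, 2)}
	adamWStep(params, grads, state, 0.001, 0.9, 0.999, 1e-8, 0.01)
	fmt.Println(params)
}
```

The weight-decay term is kept separate from the gradient-based term, which is the property that distinguishes AdamW from Adam with L2 regularization.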
func NewCPUEngine ¶
func NewCPUEngine[T tensor.Numeric]() compute.Engine[T]
NewCPUEngine creates a new CPU computation engine for the given numeric type.
Stable.
func NewDefaultTrainer ¶
func NewDefaultTrainer[T tensor.Numeric](g *graph.Graph[T], lossNode graph.Node[T], opt optimizer.Optimizer[T], strategy training.GradientStrategy[T]) *training.DefaultTrainer[T]
NewDefaultTrainer creates a new default trainer for the given graph, loss, optimizer, and gradient strategy.
Stable.
func NewFloat32Ops ¶
func NewFloat32Ops() numeric.Arithmetic[float32]
NewFloat32Ops returns the float32 arithmetic operations.
Stable.
func NewRMSNorm ¶
func NewRMSNorm[T tensor.Numeric](name string, engine compute.Engine[T], ops numeric.Arithmetic[T], modelDim int, options ...normalization.RMSNormOption[T]) (*normalization.RMSNorm[T], error)
NewRMSNorm creates a new RMSNorm normalization layer with the given configuration.
Stable.
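As background, RMS normalization divides each element by the root mean square of the vector and then applies a learned per-element gain. The sketch below shows that computation on plain slices. It is an illustration of the general technique, not the layer's actual code; the `eps` value and the `rmsNorm` helper are assumptions.

```go
package main

import (
	"fmt"
	"math"
)

// rmsNorm scales x by the reciprocal of its root mean square and then
// applies a per-element gain. Hypothetical helper for illustration.
// It assumes len(x) > 0 and len(gain) == len(x).
func rmsNorm(x, gain []float64, eps float64) []float64 {
	var sumSq float64
	for _, v := range x {
		sumSq += v * v
	}
	// eps keeps the division stable when x is close to zero.
	inv := 1 / math.Sqrt(sumSq/float64(len(x))+eps)
	out := make([]float64, len(x))
	for i, v := range x {
		out[i] = v * inv * gain[i]
	}
	return out
}

func main() {
	x := []float64{3, 4}
	gain := []float64{1, 1}
	fmt.Println(rmsNorm(x, gain, 1e-6))
}
```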
func RegisterLayer ¶
func RegisterLayer[T tensor.Numeric](opType string, builder model.LayerBuilder[T])
RegisterLayer registers a new layer builder for the given operation type.
Stable.
func UnregisterLayer ¶
func UnregisterLayer(opType string)
UnregisterLayer unregisters the layer builder for the given operation type.
Stable.
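RegisterLayer and UnregisterLayer suggest a registry keyed by operation type. One common way to build such a registry is a mutex-guarded map, sketched below. This is a guess at the general pattern only: the `registry` type, its methods, and the simplified `builder` signature are hypothetical and do not reflect how zerfoo stores its builders.

```go
package main

import (
	"fmt"
	"sync"
)

// builder is a simplified stand-in for a layer builder function.
// The real LayerBuilder takes an engine, ops, params and attributes.
type builder func(name string) (string, error)

// registry maps operation types to builders. Hypothetical type.
type registry struct {
	mu       sync.RWMutex
	builders map[string]builder
}

func newRegistry() *registry {
	return &registry{builders: make(map[string]builder)}
}

// register adds or replaces the builder for opType.
func (r *registry) register(opType string, b builder) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.builders[opType] = b
}

// unregister removes the builder for opType, if any.
func (r *registry) unregister(opType string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.builders, opType)
}

// build looks up the builder for opType and invokes it.
func (r *registry) build(opType, name string) (string, error) {
	r.mu.RLock()
	b, ok := r.builders[opType]
	r.mu.RUnlock()
	if !ok {
		return "", fmt.Errorf("no builder registered for op type %q", opType)
	}
	return b(name)
}

func main() {
	r := newRegistry()
	r.register("Dense", func(name string) (string, error) {
		return "dense layer " + name, nil
	})
	out, err := r.build("Dense", "fc1")
	fmt.Println(out, err)

	r.unregister("Dense")
	_, err = r.build("Dense", "fc1")
	fmt.Println(err)
}
```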
Types ¶
type Batch ¶
type Batch[T tensor.Numeric] struct {
Inputs map[graph.Node[T]]*tensor.TensorNumeric[T]
Targets *tensor.TensorNumeric[T]
}
Batch represents a training batch of inputs and targets.
Stable.
type Embedding ¶
type Embedding struct {
Vector []float32
}
Embedding holds a text embedding vector.
Stable.
func (Embedding) CosineSimilarity ¶
CosineSimilarity computes the cosine similarity between two embeddings.
Stable.
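Cosine similarity is the dot product of two vectors divided by the product of their Euclidean norms. A minimal sketch on raw `[]float32` slices follows; the free function `cosineSimilarity` is hypothetical, and returning 0 for a zero-length vector is an assumption rather than documented behavior of the method.

```go
package main

import (
	"fmt"
	"math"
)

// cosineSimilarity returns dot(a, b) / (|a| * |b|).
// Hypothetical helper; assumes len(a) == len(b).
// Returns 0 when either vector has zero norm (an assumption).
func cosineSimilarity(a, b []float32) float32 {
	var dot, normA, normB float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		normA += float64(a[i]) * float64(a[i])
		normB += float64(b[i]) * float64(b[i])
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return float32(dot / (math.Sqrt(normA) * math.Sqrt(normB)))
}

func main() {
	fmt.Println(cosineSimilarity([]float32{1, 0}, []float32{0, 1}))
	fmt.Println(cosineSimilarity([]float32{1, 2}, []float32{2, 4}))
}
```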
type GenerateOption ¶
type GenerateOption func(*generateOptions)
GenerateOption configures the behavior of Model.Generate.
Stable.
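The type `func(*generateOptions)` follows Go's functional options pattern: each option is a closure that mutates a private settings struct. The sketch below shows the pattern in isolation. The struct fields, default values, and lower-case helper names are all hypothetical stand-ins, since the real generateOptions struct is unexported.

```go
package main

import "fmt"

// genOptions is a hypothetical stand-in for the unexported settings struct.
type genOptions struct {
	maxTokens   int
	temperature float32
	topP        float32
}

// genOption mutates a genOptions value.
type genOption func(*genOptions)

func withMaxTokens(n int) genOption {
	return func(o *genOptions) { o.maxTokens = n }
}

func withTemperature(t float32) genOption {
	return func(o *genOptions) { o.temperature = t }
}

func withTopP(p float32) genOption {
	return func(o *genOptions) { o.topP = p }
}

// resolve starts from illustrative defaults and applies options in order,
// so later options override earlier ones.
func resolve(opts ...genOption) genOptions {
	o := genOptions{maxTokens: 256, temperature: 1.0, topP: 1.0}
	for _, opt := range opts {
		opt(&o)
	}
	return o
}

func main() {
	o := resolve(withMaxTokens(64), withTemperature(0.7))
	fmt.Printf("%+v\n", o)
}
```

The benefit of this design is that new settings can be added later without changing the signature of the function that accepts them.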
func WithGenMaxTokens ¶
func WithGenMaxTokens(n int) GenerateOption
WithGenMaxTokens sets the maximum number of tokens to generate.
Stable.
func WithGenTemperature ¶
func WithGenTemperature(t float32) GenerateOption
WithGenTemperature sets the sampling temperature.
Stable.
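In most samplers, temperature divides the logits before the softmax: values below 1 sharpen the distribution and values above 1 flatten it. The sketch below illustrates that general technique; whether zerfoo applies temperature in exactly this way is an assumption, and `softmaxWithTemperature` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"math"
)

// softmaxWithTemperature returns softmax(logits / t).
// Hypothetical helper; assumes len(logits) > 0 and t > 0.
func softmaxWithTemperature(logits []float64, t float64) []float64 {
	// Subtract the maximum logit for numerical stability.
	maxLogit := logits[0]
	for _, l := range logits[1:] {
		if l > maxLogit {
			maxLogit = l
		}
	}
	probs := make([]float64, len(logits))
	var sum float64
	for i, l := range logits {
		probs[i] = math.Exp((l - maxLogit) / t)
		sum += probs[i]
	}
	for i := range probs {
		probs[i] /= sum
	}
	return probs
}

func main() {
	logits := []float64{2, 1, 0}
	fmt.Println(softmaxWithTemperature(logits, 1.0))
	fmt.Println(softmaxWithTemperature(logits, 0.5))
}
```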
func WithGenTopP ¶
func WithGenTopP(p float32) GenerateOption
WithGenTopP sets the top-p (nucleus) sampling parameter.
Stable.
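Top-p (nucleus) sampling keeps the smallest set of highest-probability tokens whose cumulative probability reaches p, then renormalizes over that set before sampling. A minimal sketch of the filtering step is shown below. It illustrates the general technique rather than zerfoo's sampler, and `topPFilter` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
)

// topPFilter returns the renormalized probabilities of the nucleus,
// keyed by token index. Hypothetical helper; assumes probs is a
// non-empty probability distribution and 0 < p <= 1.
func topPFilter(probs []float64, p float64) map[int]float64 {
	// Sort token indices by descending probability.
	idx := make([]int, len(probs))
	for i := range idx {
		idx[i] = i
	}
	sort.SliceStable(idx, func(a, b int) bool {
		return probs[idx[a]] > probs[idx[b]]
	})

	// Keep tokens until the cumulative probability reaches p.
	var kept []int
	var cum float64
	for _, i := range idx {
		kept = append(kept, i)
		cum += probs[i]
		if cum >= p {
			break
		}
	}

	// Renormalize so the kept probabilities sum to 1.
	out := make(map[int]float64, len(kept))
	for _, i := range kept {
		out[i] = probs[i] / cum
	}
	return out
}

func main() {
	probs := []float64{0.5, 0.3, 0.15, 0.05}
	fmt.Println(topPFilter(probs, 0.7))
}
```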
func WithSchema ¶
func WithSchema(schema grammar.JSONSchema) GenerateOption
WithSchema enables grammar-guided decoding.
The model's output will be constrained to valid JSON matching the given schema.
Experimental.
func WithToolChoice ¶
func WithToolChoice(choice serve.ToolChoice) GenerateOption
WithToolChoice sets the tool choice mode for tool call detection.
Experimental.
func WithTools ¶
func WithTools(tools ...serve.Tool) GenerateOption
WithTools configures the tools available for tool call detection.
When tools are provided, Model.Generate will attempt to detect tool calls in the model output and populate GenerateResult.ToolCalls.
Experimental.
type GenerateResult ¶
type GenerateResult struct {
Text string
TokenCount int
Duration time.Duration
ToolCalls []ToolCall
}
GenerateResult holds the result of a text generation call.
Stable.
type LayerBuilder ¶
type LayerBuilder[T tensor.Numeric] func(engine compute.Engine[T], ops numeric.Arithmetic[T], name string, params map[string]*graph.Parameter[T], attributes map[string]interface{}) (graph.Node[T], error)
LayerBuilder is a function that builds a computation graph layer.
Stable.
type Model ¶
type Model struct {
// contains filtered or unexported fields
}
Model is a loaded language model ready for inference.
A Model is created via Load and used for text generation, embedding, and tool-call detection. Model.Close must be called when the model is no longer needed to release GPU and CPU resources.
Stable.
func Load ¶
Load loads a model from a file path or HuggingFace model ID.
Paths starting with "/", "./" or "../" are treated as local GGUF files. All other strings are treated as HuggingFace model IDs (e.g. "google/gemma-3-4b" or "google/gemma-3-4b/Q8_0"). If the model is not cached locally it will be downloaded from HuggingFace.
Stable.
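The path rule described above can be expressed as a simple prefix check. The sketch below encodes only what the description states; `isLocalPath` is a hypothetical helper, and Load itself may perform additional validation that is not documented here.

```go
package main

import (
	"fmt"
	"strings"
)

// isLocalPath reports whether ref would be treated as a local GGUF file
// under the documented rule: it starts with "/", "./" or "../".
// Anything else would be treated as a HuggingFace model ID.
// Hypothetical helper for illustration.
func isLocalPath(ref string) bool {
	return strings.HasPrefix(ref, "/") ||
		strings.HasPrefix(ref, "./") ||
		strings.HasPrefix(ref, "../")
}

func main() {
	for _, ref := range []string{
		"/models/gemma.gguf",
		"./gemma.gguf",
		"../gemma.gguf",
		"google/gemma-3-4b",
		"google/gemma-3-4b/Q8_0",
	} {
		fmt.Printf("%-26s local=%v\n", ref, isLocalPath(ref))
	}
}
```

Note that under this rule a bare relative path such as `models/gemma.gguf` would be read as a model ID, so local files need an explicit `./` prefix.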
func (*Model) ChatStream ¶
func (m *Model) ChatStream(ctx context.Context, prompt string, opts ...GenerateOption) (<-chan StreamToken, error)
ChatStream starts streaming generation and returns a receive-only channel that yields StreamToken values as they are generated. The channel is closed when generation completes or ctx is canceled. The error return is non-nil only if startup fails (e.g. the model is not loaded).
Stable.
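Because the channel is closed when generation finishes or the context is canceled, a caller can drain it with a plain `for range` loop. The sketch below demonstrates that consumption pattern against a stand-in producer. The `token` type, its `Text` field, and `streamWords` are hypothetical; the fields of StreamToken are not shown in this documentation.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// token is a hypothetical stand-in for StreamToken.
type token struct{ Text string }

// streamWords mimics a streaming producer: it sends one token per word
// and closes the channel when done or when ctx is canceled.
func streamWords(ctx context.Context, words []string) <-chan token {
	ch := make(chan token)
	go func() {
		defer close(ch)
		for _, w := range words {
			select {
			case ch <- token{Text: w}:
			case <-ctx.Done():
				return
			}
		}
	}()
	return ch
}

// streamAndCollect drains the stream and joins the token text.
func streamAndCollect(words []string) string {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel() // stops the producer if we return early

	var sb strings.Builder
	for t := range streamWords(ctx, words) {
		sb.WriteString(t.Text)
	}
	return sb.String()
}

func main() {
	fmt.Println(streamAndCollect([]string{"Hello", ", ", "world"}))
}
```

Canceling the context is what lets a consumer stop early without leaking the producing goroutine.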
func (*Model) Embed ¶
func (m *Model) Embed(texts []string) ([]Embedding, error)
Embed returns embeddings for the given texts.
Each input string is tokenized, its token embeddings are looked up from the model's embedding table, mean-pooled, and L2-normalized.
Stable.
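The pooling steps described above (mean over token vectors, then L2 normalization) can be sketched as follows. Tokenization and the embedding-table lookup are omitted; the input is assumed to be the already looked-up token vectors, and `meanPoolL2` is a hypothetical helper rather than the method's actual code.

```go
package main

import (
	"fmt"
	"math"
)

// meanPoolL2 averages the token vectors element-wise and scales the
// result to unit L2 norm. Hypothetical helper; assumes every token
// vector has the same length.
func meanPoolL2(tokenVecs [][]float32) []float32 {
	if len(tokenVecs) == 0 {
		return nil
	}
	out := make([]float32, len(tokenVecs[0]))
	for _, vec := range tokenVecs {
		for j := range out {
			out[j] += vec[j]
		}
	}

	// Mean pooling, accumulating the squared norm as we go.
	n := float32(len(tokenVecs))
	var sumSq float64
	for j := range out {
		out[j] /= n
		sumSq += float64(out[j]) * float64(out[j])
	}

	// L2 normalization; a zero vector is returned unchanged.
	norm := float32(math.Sqrt(sumSq))
	if norm == 0 {
		return out
	}
	for j := range out {
		out[j] /= norm
	}
	return out
}

func main() {
	tokens := [][]float32{{3, 0}, {3, 8}}
	fmt.Println(meanPoolL2(tokens))
}
```

With unit-length outputs, the cosine similarity of two embeddings reduces to their dot product.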
func (*Model) Generate ¶
func (m *Model) Generate(ctx context.Context, prompt string, opts ...GenerateOption) (*GenerateResult, error)
Generate runs text generation with the given prompt and options.
Stable.
Directories ¶
| Path | Synopsis |
|---|---|
|
Package autoopt provides automatic optimization recommendations based on hardware profiling.
|
Package autoopt provides automatic optimization recommendations based on hardware profiling. |
|
Experimental — this package is not yet wired into the main framework.
|
Experimental — this package is not yet wired into the main framework. |
|
Package cloud provides a multi-tenant managed inference service for Zerfoo.
|
Package cloud provides a multi-tenant managed inference service for Zerfoo. |
|
cmd
|
|
|
bench
command
Command bench runs a standardized benchmark harness for zerfoo models.
|
Command bench runs a standardized benchmark harness for zerfoo models. |
|
bench-compare
command
Command bench-compare compares two NDJSON benchmark result files and outputs a markdown regression report.
|
Command bench-compare compares two NDJSON benchmark result files and outputs a markdown regression report. |
|
bench_batch
command
Command bench_batch benchmarks continuous batching vs session pool throughput.
|
Command bench_batch benchmarks continuous batching vs session pool throughput. |
|
bench_disagg
command
Command bench_disagg benchmarks disaggregated vs collocated serving throughput.
|
Command bench_disagg benchmarks disaggregated vs collocated serving throughput. |
|
bench_mamba
command
Command bench_mamba benchmarks Mamba-3 SSM vs Transformer attention decode throughput using synthetic FLOPs-based timing estimates.
|
Command bench_mamba benchmarks Mamba-3 SSM vs Transformer attention decode throughput using synthetic FLOPs-based timing estimates. |
|
bench_prefix
command
Command bench_prefix simulates a multi-turn chat workload to measure prefix cache hit rate and TTFT reduction.
|
Command bench_prefix simulates a multi-turn chat workload to measure prefix cache hit rate and TTFT reduction. |
|
bench_spec
command
Command bench_spec benchmarks speculative decoding speedup by comparing standalone target model decode against speculative decode (target + draft).
|
Command bench_spec benchmarks speculative decoding speedup by comparing standalone target model decode against speculative decode (target + draft). |
|
bench_tps
command
bench_tps measures tokens-per-second for a local ZMF model.
|
bench_tps measures tokens-per-second for a local ZMF model. |
|
cli
Package cli provides the command-line interface framework for Zerfoo.
|
Package cli provides the command-line interface framework for Zerfoo. |
|
coverage-gate
command
Command coverage-gate reads a Go coverage profile and fails if any testable package drops below the configured coverage threshold.
|
Command coverage-gate reads a Go coverage profile and fails if any testable package drops below the configured coverage threshold. |
|
debug-infer
command
|
|
|
deprecation-check
command
Package main implements a linter that checks // Deprecated: doc comments for proper replacement guidance and version information.
|
Package main implements a linter that checks // Deprecated: doc comments for proper replacement guidance and version information. |
|
finetune
command
Command finetune runs QLoRA fine-tuning on a GGUF model.
|
Command finetune runs QLoRA fine-tuning on a GGUF model. |
|
train_distributed
command
Command train_distributed launches distributed training using FSDP.
|
Command train_distributed launches distributed training using FSDP. |
|
ts_train
command
Command ts_train trains a PatchTST time-series signal model on offline feature data.
|
Command ts_train trains a PatchTST time-series signal model on offline feature data. |
|
zerfoo
command
|
|
|
zerfoo-edge
command
Package main provides a minimal edge/embedded inference binary for Zerfoo.
|
Package main provides a minimal edge/embedded inference binary for Zerfoo. |
|
zerfoo-predict
command
|
|
|
zerfoo-tokenize
command
|
|
|
Package compliance provides SOC 2 compliance automation tooling including Trust Services Criteria control mapping, evidence collection, policy document generation, and control status tracking.
|
Package compliance provides SOC 2 compliance automation tooling including Trust Services Criteria control mapping, evidence collection, policy document generation, and control status tracking. |
|
audit
Package audit provides SOC 2 Type I audit tooling including readiness assessment, evidence collection automation, gap analysis, and report generation.
|
Package audit provides SOC 2 Type I audit tooling including readiness assessment, evidence collection automation, gap analysis, and report generation. |
|
observation
Package observation implements the SOC 2 Type II observation period framework.
|
Package observation implements the SOC 2 Type II observation period framework. |
|
Package config provides file-based configuration loading with validation.
|
Package config provides file-based configuration loading with validation. |
|
Experimental — this package is not yet wired into the main framework.
|
Experimental — this package is not yet wired into the main framework. |
|
Package data provides dataset containers for training batches and normalization.
|
Package data provides dataset containers for training batches and normalization. |
|
deploy
|
|
|
aws
Package aws provides AWS Marketplace Metering API integration for Zerfoo.
|
Package aws provides AWS Marketplace Metering API integration for Zerfoo. |
|
Package distributed provides multi-node distributed training for the Zerfoo ML framework.
|
Package distributed provides multi-node distributed training for the Zerfoo ML framework. |
|
coordinator
Package coordinator provides a distributed training coordinator.
|
Package coordinator provides a distributed training coordinator. |
|
fsdp
Package fsdp implements Fully Sharded Data Parallelism for distributed training.
|
Package fsdp implements Fully Sharded Data Parallelism for distributed training. |
|
docs
|
|
|
cookbook/01-basic-text-generation
command
Recipe 01: Basic Text Generation
|
Recipe 01: Basic Text Generation |
|
cookbook/02-streaming-chat
command
Recipe 02: Streaming Chat
|
Recipe 02: Streaming Chat |
|
cookbook/03-embedding-similarity
command
Recipe 03: Embedding and Cosine Similarity
|
Recipe 03: Embedding and Cosine Similarity |
|
cookbook/04-openai-server
command
Recipe 04: OpenAI-Compatible Server
|
Recipe 04: OpenAI-Compatible Server |
|
cookbook/05-custom-sampling
command
Recipe 05: Custom Sampling Parameters
|
Recipe 05: Custom Sampling Parameters |
|
cookbook/06-structured-json-output
command
Recipe 06: Structured JSON Output
|
Recipe 06: Structured JSON Output |
|
cookbook/07-lora-fine-tuning
command
Recipe 07: Fine-Tuning with LoRA
|
Recipe 07: Fine-Tuning with LoRA |
|
cookbook/08-batch-inference
command
Recipe 08: Batch Inference
|
Recipe 08: Batch Inference |
|
cookbook/09-speculative-decoding
command
Recipe 09: Speculative Decoding
|
Recipe 09: Speculative Decoding |
|
cookbook/10-tool-calling
command
Recipe 10: Tool / Function Calling
|
Recipe 10: Tool / Function Calling |
|
cookbook/11-rag
command
Recipe 11: Retrieval-Augmented Generation (RAG)
|
Recipe 11: Retrieval-Augmented Generation (RAG) |
|
cookbook/12-vision-multimodal
command
Recipe 12: Vision / Multimodal Inference
|
Recipe 12: Vision / Multimodal Inference |
|
examples
|
|
|
agentic-tool-use
command
Command agentic-tool-use demonstrates function calling (tool use) with a language model using the zerfoo one-line API.
|
Command agentic-tool-use demonstrates function calling (tool use) with a language model using the zerfoo one-line API. |
|
api-server
command
Command api-server demonstrates starting an OpenAI-compatible inference server.
|
Command api-server demonstrates starting an OpenAI-compatible inference server. |
|
audio-transcription
command
Command audio-transcription demonstrates speech-to-text using the Zerfoo OpenAI-compatible API server.
|
Command audio-transcription demonstrates speech-to-text using the Zerfoo OpenAI-compatible API server. |
|
automl
command
Command automl demonstrates using the AutoML coordinator to search over hyperparameter configurations with Bayesian optimization and early stopping.
|
Command automl demonstrates using the AutoML coordinator to search over hyperparameter configurations with Bayesian optimization and early stopping. |
|
chat
command
Command chat demonstrates a simple interactive chatbot using the zerfoo one-line API.
|
Command chat demonstrates a simple interactive chatbot using the zerfoo one-line API. |
|
classification
command
Command classification demonstrates text classification using grammar-constrained decoding to guarantee a valid JSON response with a category label.
|
Command classification demonstrates text classification using grammar-constrained decoding to guarantee a valid JSON response with a category label. |
|
code-completion
command
Command code-completion demonstrates using a language model for code completion.
|
Command code-completion demonstrates using a language model for code completion. |
|
distributed-training
command
Command distributed-training demonstrates setting up FSDP distributed training with gradient accumulation using the zerfoo distributed and training packages.
|
Command distributed-training demonstrates setting up FSDP distributed training with gradient accumulation using the zerfoo distributed and training packages. |
|
embedding
command
Command embedding demonstrates embedding Zerfoo inference inside a Go HTTP handler.
|
Command embedding demonstrates embedding Zerfoo inference inside a Go HTTP handler. |
|
embedding-search
command
Command embedding-search demonstrates semantic search using model embeddings.
|
Command embedding-search demonstrates semantic search using model embeddings. |
|
fine-tuning
command
Command fine-tuning demonstrates parameter-efficient fine-tuning using LoRA (Low-Rank Adaptation) on a tabular model.
|
Command fine-tuning demonstrates parameter-efficient fine-tuning using LoRA (Low-Rank Adaptation) on a tabular model. |
|
inference
command
Command inference demonstrates loading a GGUF model and generating text.
|
Command inference demonstrates loading a GGUF model and generating text. |
|
json-output
command
Command json-output demonstrates grammar-guided decoding with a JSON schema.
|
Command json-output demonstrates grammar-guided decoding with a JSON schema. |
|
langchain-chatbot
command
Command langchain-chatbot demonstrates using the Zerfoo LangChain adapter as a drop-in LLM for a simple interactive chatbot loop.
|
Command langchain-chatbot demonstrates using the Zerfoo LangChain adapter as a drop-in LLM for a simple interactive chatbot loop. |
|
rag
command
Command rag demonstrates retrieval-augmented generation using Zerfoo.
|
Command rag demonstrates retrieval-augmented generation using Zerfoo. |
|
streaming
command
Command streaming demonstrates streaming chat generation using the zerfoo API.
|
Command streaming demonstrates streaming chat generation using the zerfoo API. |
|
summarization
command
Command summarization demonstrates text summarization using a GGUF language model.
|
Command summarization demonstrates text summarization using a GGUF language model. |
|
text-embedding
command
Command text-embedding demonstrates extracting text embedding vectors from a loaded GGUF model using the inference package.
|
Command text-embedding demonstrates extracting text embedding vectors from a loaded GGUF model using the inference package. |
|
timeseries
command
Command timeseries demonstrates time-series forecasting with the N-BEATS model using the zerfoo timeseries package.
|
Command timeseries demonstrates time-series forecasting with the N-BEATS model using the zerfoo timeseries package. |
|
translation
command
Command translation demonstrates text translation using a GGUF language model.
|
Command translation demonstrates text translation using a GGUF language model. |
|
vision-analysis
command
Command vision-analysis demonstrates multimodal inference with image input.
|
Command vision-analysis demonstrates multimodal inference with image input. |
|
weaviate-search
command
Command weaviate-search demonstrates using the Zerfoo Weaviate adapter to embed a corpus of documents and perform cosine-similarity semantic search without requiring a live Weaviate instance.
|
Command weaviate-search demonstrates using the Zerfoo Weaviate adapter to embed a corpus of documents and perform cosine-similarity semantic search without requiring a live Weaviate instance. |
|
Experimental — this package is not yet wired into the main framework.
|
Experimental — this package is not yet wired into the main framework. |
|
Package federated provides federated learning interfaces and a FedAvg baseline implementation.
|
Package federated provides federated learning interfaces and a FedAvg baseline implementation. |
|
Package generate implements autoregressive text generation for transformer models loaded by the inference package.
|
Package generate implements autoregressive text generation for transformer models loaded by the inference package. |
|
agent
Package agent implements the agentic tool-use loop for multi-step reasoning.
|
Package agent implements the agentic tool-use loop for multi-step reasoning. |
|
grammar
Package grammar converts a subset of JSON Schema into a context-free grammar for constrained decoding.
|
Package grammar converts a subset of JSON Schema into a context-free grammar for constrained decoding. |
|
speculative
Package speculative implements speculative decoding strategies for accelerated generation.
|
Package speculative implements speculative decoding strategies for accelerated generation. |
|
Experimental — this package is not yet wired into the main framework.
|
Experimental — this package is not yet wired into the main framework. |
|
Experimental — this package is not yet wired into the main framework.
|
Experimental — this package is not yet wired into the main framework. |
|
Package health provides HTTP health check endpoints for Kubernetes-style liveness and readiness probes.
|
Package health provides HTTP health check endpoints for Kubernetes-style liveness and readiness probes. |
|
Package inference provides a high-level API for loading GGUF models and running text generation, chat, embedding, and speculative decoding with minimal boilerplate.
|
Package inference provides a high-level API for loading GGUF models and running text generation, chat, embedding, and speculative decoding with minimal boilerplate. |
|
multimodal
Package multimodal provides audio preprocessing for audio-language model inference.
|
Package multimodal provides audio preprocessing for audio-language model inference. |
|
parallel
Package parallel provides tensor and pipeline parallelism for distributing inference across multiple GPUs.
|
Package parallel provides tensor and pipeline parallelism for distributing inference across multiple GPUs. |
|
sentiment
Package sentiment provides a high-level sentiment classification pipeline that wraps encoder model loading and inference.
|
Package sentiment provides a high-level sentiment classification pipeline that wraps encoder model loading and inference. |
|
timeseries
Package timeseries implements time-series model builders.
|
Package timeseries implements time-series model builders. |
|
timeseries/features
Package features provides a feature store for the Wolf time-series ML platform.
|
Package features provides a feature store for the Wolf time-series ML platform. |
|
integrations
|
|
|
langchain
Package langchain provides an adapter that makes Zerfoo's OpenAI-compatible HTTP API compatible with LangChain-Go's LLM interface.
|
Package langchain provides an adapter that makes Zerfoo's OpenAI-compatible HTTP API compatible with LangChain-Go's LLM interface. |
|
weaviate
Package weaviate provides an adapter for generating embeddings via Zerfoo's OpenAI-compatible HTTP API and inserting them into a Weaviate vector database client.
|
Package weaviate provides an adapter for generating embeddings via Zerfoo's OpenAI-compatible HTTP API and inserting them into a Weaviate vector database client. |
|
internal
|
|
|
clblast
Package clblast provides Go wrappers for the CLBlast BLAS library.
|
Package clblast provides Go wrappers for the CLBlast BLAS library. |
|
codegen
Package codegen generates CUDA megakernel source code from compiled computation graphs.
|
Package codegen generates CUDA megakernel source code from compiled computation graphs. |
|
cublas
Package cublas provides low-level purego bindings for the cuBLAS library.
|
Package cublas provides low-level purego bindings for the cuBLAS library. |
|
cuda
Package cuda provides low-level bindings for the CUDA runtime API using (Stability: stable) dlopen/dlsym (no CGo).
|
Package cuda provides low-level bindings for the CUDA runtime API using (Stability: stable) dlopen/dlsym (no CGo). |
|
cuda/kernels
Package kernels provides Go wrappers for custom CUDA kernels.
|
Package kernels provides Go wrappers for custom CUDA kernels. |
|
cudnn
Package cudnn provides purego bindings for the NVIDIA cuDNN library.
|
Package cudnn provides purego bindings for the NVIDIA cuDNN library. |
|
gpuapi
Package gpuapi defines internal interfaces for GPU runtime operations.
|
Package gpuapi defines internal interfaces for GPU runtime operations. |
|
hip
Package hip provides low-level bindings for the AMD HIP runtime API (Stability: alpha) using purego dlopen.
|
Package hip provides low-level bindings for the AMD HIP runtime API (Stability: alpha) using purego dlopen. |
|
hip/kernels
Package kernels provides Go wrappers for custom HIP kernels via purego (Stability: alpha) dlopen.
|
Package kernels provides Go wrappers for custom HIP kernels via purego (Stability: alpha) dlopen. |
|
miopen
Package miopen provides low-level bindings for the AMD MIOpen library (Stability: alpha) using purego dlopen.
|
Package miopen provides low-level bindings for the AMD MIOpen library (Stability: alpha) using purego dlopen. |
|
nccl
Package nccl provides CGo bindings for the NVIDIA Collective Communications (Stability: beta) Library (NCCL).
|
Package nccl provides CGo bindings for the NVIDIA Collective Communications (Stability: beta) Library (NCCL). |
|
opencl
Package opencl provides Go wrappers for the OpenCL 2.0 runtime API.
|
Package opencl provides Go wrappers for the OpenCL 2.0 runtime API. |
|
opencl/kernels
Package kernels provides OpenCL kernel source and dispatch for elementwise operations.
|
Package kernels provides OpenCL kernel source and dispatch for elementwise operations. |
|
rocblas
Package rocblas provides low-level bindings for the AMD rocBLAS library (Stability: alpha) using purego dlopen.
|
Package rocblas provides low-level bindings for the AMD rocBLAS library (Stability: alpha) using purego dlopen. |
|
tensorrt
Package tensorrt provides bindings for the NVIDIA TensorRT inference (Stability: alpha) library via purego (dlopen/dlsym, no CGo).
|
Package tensorrt provides bindings for the NVIDIA TensorRT inference (Stability: alpha) library via purego (dlopen/dlsym, no CGo). |
|
workerpool
Package workerpool provides a persistent pool of goroutines that process submitted tasks.
|
Package workerpool provides a persistent pool of goroutines that process submitted tasks. |
|
xblas
Package xblas provides CPU BLAS wrappers with ARM NEON and AVX2 SIMD assembly.
|
Package xblas provides CPU BLAS wrappers with ARM NEON and AVX2 SIMD assembly. |
|
Package layers provides neural network layer implementations for the Zerfoo ML framework.
|
Package layers provides neural network layer implementations for the Zerfoo ML framework. |
|
activations
Package activations provides activation function layers.
|
Package activations provides activation function layers. |
|
attention
Package attention provides attention mechanisms for neural networks.
|
Package attention provides attention mechanisms for neural networks. |
|
audio
Package audio provides audio-related neural network layers.
|
Package audio provides audio-related neural network layers. |
|
components
Package components provides reusable components for neural network layers.
|
Package components provides reusable components for neural network layers. |
|
core
Package core provides core neural network layer implementations.
|
Package core provides core neural network layer implementations. |
|
embeddings
Package embeddings provides neural network embedding layers.
|
Package embeddings provides neural network embedding layers. |
|
gather
Package gather provides the Gather layer for embedding-table lookup.
|
Package gather provides the Gather layer for embedding-table lookup. |
|
hrm
Package hrm implements the Hierarchical Reasoning Model.
|
Package hrm implements the Hierarchical Reasoning Model. |
|
normalization
Package normalization provides normalization layers for neural networks.
|
Package normalization provides normalization layers for neural networks. |
|
recurrent
Package recurrent provides recurrent neural network layers.
|
Package recurrent provides recurrent neural network layers. |
|
reducesum
Package reducesum provides the ReduceSum layer for axis-wise reduction.
|
Package reducesum provides the ReduceSum layer for axis-wise reduction. |
|
registry
Package registry provides a central registration point for all layer builders.
|
Package registry provides a central registration point for all layer builders. |
|
regularization
Package regularization provides regularization layers for neural networks.
|
Package regularization provides regularization layers for neural networks. |
|
residual
Package residual provides residual connection layers for neural networks.
|
Package residual provides residual connection layers for neural networks. |
|
ssm
Package ssm implements state space model layers.
|
Package ssm implements state space model layers. |
|
timeseries
Package timeseries provides time-series specific neural network layers.
|
Package timeseries provides time-series specific neural network layers. |
|
transformer
Package transformer provides transformer building blocks such as the Transformer `Block` used in encoder/decoder stacks.
|
Package transformer provides transformer building blocks such as the Transformer `Block` used in encoder/decoder stacks. |
|
transpose
Package transpose provides the Transpose layer for axis permutation.
|
Package transpose provides the Transpose layer for axis permutation. |
|
vision
Package vision provides vision-related neural network layers.
|
Package vision provides vision-related neural network layers. |
|
Package marketplace provides a unified abstraction layer for cloud marketplace integrations across AWS, GCP, and Azure.
|
Package marketplace provides a unified abstraction layer for cloud marketplace integrations across AWS, GCP, and Azure. |
|
aws
Package aws provides AWS Marketplace integration for Zerfoo Cloud, including metering, subscription lifecycle management, entitlement verification, and token-based billing.
|
Package aws provides AWS Marketplace integration for Zerfoo Cloud, including metering, subscription lifecycle management, entitlement verification, and token-based billing. |
|
azure
Package azure provides Azure Marketplace integration for Zerfoo Cloud, including SaaS Fulfillment API v2, Marketplace Metering Service, subscription lifecycle management, and webhook handling.
|
Package azure provides Azure Marketplace integration for Zerfoo Cloud, including SaaS Fulfillment API v2, Marketplace Metering Service, subscription lifecycle management, and webhook handling. |
|
gcp
Package gcp provides GCP Marketplace integration for Zerfoo Cloud, including Cloud Commerce Partner Procurement API integration, SaaS entitlement management, Service Control API usage metering, and token-based billing.
|
Package gcp provides GCP Marketplace integration for Zerfoo Cloud, including Cloud Commerce Partner Procurement API integration, SaaS entitlement management, Service Control API usage metering, and token-based billing. |
|
Experimental — this package is not yet wired into the main framework.
|
Experimental — this package is not yet wired into the main framework. |
|
Package mobile provides gomobile-compatible bindings for zerfoo inference.
|
Package mobile provides gomobile-compatible bindings for zerfoo inference. |
|
Package model provides adapter implementations for bridging existing and new model interfaces.
|
Package model provides adapter implementations for bridging existing and new model interfaces. |
|
gguf
Package gguf provides GGUF file format parsing and writing.
|
Package gguf provides GGUF file format parsing and writing. |
|
hrm
Package hrm provides experimental Hierarchical Reasoning Model types.
|
Package hrm provides experimental Hierarchical Reasoning Model types. |
|
huggingface
Package huggingface provides HuggingFace model configuration parsing.
Package modelcache provides an LRU model file cache for pre-caching GGUF models on Kubernetes nodes via a DaemonSet.
Experimental — this package is not yet wired into the main framework.
Experimental — this package is not yet wired into the main framework.
Experimental — this package is not yet wired into the main framework.
Experimental — this package is not yet wired into the main framework.
Experimental — this package is not yet wired into the main framework.
Package registry provides a model registry with local cache, pull, get, list, and delete operations.
Experimental — this package is not yet wired into the main framework.
Package security implements SOC 2 security controls for the Zerfoo ML framework.
Package serve provides an OpenAI-compatible HTTP API server for model inference.
adaptive
Package adaptive implements an adaptive batch scheduler that dynamically adjusts batch size based on queue depth and latency targets to maximize throughput while meeting latency SLOs.
agent
Package agent adapts the generate/agent agentic loop to the serving layer.
batcher
Package batcher implements a continuous batching scheduler for inference serving.
cloud
Package cloud provides multi-tenant namespace isolation for the serving layer.
disaggregated
Package disaggregated implements disaggregated prefill/decode serving.
disaggregated/proto
Package disaggpb defines the gRPC service contracts for disaggregated prefill/decode serving.
multimodel
Package multimodel provides a ModelManager that loads and unloads models on demand with LRU eviction when GPU memory budget is exceeded.
operator
Package operator provides a Kubernetes operator for managing ZerfooInferenceService custom resources.
registry
Package registry provides a bbolt-backed model version registry for tracking and A/B testing.
repository
Package repository provides a model repository for storing and managing GGUF model files.
Experimental — this package is not yet wired into the main framework.
Package shutdown provides orderly shutdown coordination using context cancellation and cleanup callbacks.
Package support implements an enterprise support ticketing system with priority routing, SLA tracking, and webhook notifications.
Experimental — this package is not yet wired into the main framework.
Package tabular provides tabular ML model types.
testing
benchmark
Package benchmark provides a standardized benchmark suite for measuring ML model inference performance: tok/s decode, tok/s prefill, memory usage, and time to first token.
compare
Package compare provides a model comparison tool that runs the same prompts through multiple models and compares their performance metrics.
tests
training
Package training contains end-to-end training loop integration tests.
Package timeseries provides time-series forecasting models built on ztensor.
Package training provides adapter implementations for bridging existing and new interfaces.
automl
Package automl provides automated machine learning utilities including Bayesian hyperparameter optimization.
fp8
Package fp8 implements FP8 mixed-precision training support.
lora
Package lora implements LoRA and QLoRA fine-tuning adapters.
loss
Package loss provides various loss functions for neural networks.
nas
Package nas implements neural architecture search using DARTS.
online
Package online implements online learning with drift detection and model rollback.
optimizer
Package optimizer provides various optimization algorithms for neural networks.
scheduler
Package scheduler provides learning rate scheduling strategies for optimizers.