examples

ZigCUDA Examples

This directory contains practical examples demonstrating how to use the ZigCUDA library.

Prerequisites

CUDA-capable GPU with driver installed
Zig compiler (0.15.2 or later)
CUDA Toolkit (for PTX compilation, optional)

Examples Overview

01_device_info.zig

Basic device enumeration and querying

Demonstrates:

Initializing CUDA context
Enumerating available devices
Querying device properties (name, compute capability, memory, SM count)

zig build-exe examples/01_device_info.zig --dep zigcuda --mod zigcuda::src/lib.zig
./01_device_info

02_memory_transfer.zig

Host ↔ Device memory operations

Demonstrates:

Allocating device memory
Copying data from host to device
Copying data from device to host
Memory cleanup and verification

zig build-exe examples/02_memory_transfer.zig --dep zigcuda --mod zigcuda::src/lib.zig
./02_memory_transfer

03_kernel_launch.zig

Loading and launching CUDA kernels

Demonstrates:

Loading PTX modules
Extracting kernel functions
Setting up kernel parameters
Launching kernels with grid/block configuration
Result verification

zig build-exe examples/03_kernel_launch.zig --dep zigcuda --mod zigcuda::src/lib.zig
./03_kernel_launch

04_streams.zig

Asynchronous operations with CUDA streams

Demonstrates:

Creating multiple CUDA streams
Launching concurrent operations
Stream synchronization
Querying stream status

zig build-exe examples/04_streams.zig --dep zigcuda --mod zigcuda::src/lib.zig
./04_streams

Building All Examples

You can add these examples to your build.zig to make them easy to build:

// In build.zig, add:
const examples = [_][]const u8{
    "01_device_info",
    "02_memory_transfer",
    "03_kernel_launch",
    "04_streams",
};

for (examples) |example_name| {
    const exe = b.addExecutable(.{
        .name = example_name,
        .root_source_file = b.path(b.fmt("examples/{s}.zig", .{example_name})),
        .target = target,
        .optimize = optimize,
    });
    exe.root_module.addImport("zigcuda", lib_module);
    b.installArtifact(exe);
}

Then build with:

zig build

Kernels Directory

The kernels/ subdirectory contains PTX (Parallel Thread Execution) code for CUDA kernels:

vector_add.ptx: Vector addition kernel used by example 03

Compiling Your Own Kernels

For compute capability 12.0 (Blackwell):

nvcc --cubin -arch=compute_120 -code=sm_120 your_kernel.cu -o your_kernel.cubin

Expected Output

Example 01 - Device Info

Found 1 CUDA device(s)

Device 0: NVIDIA RTX PRO 6000 Blackwell Workstation Edition
  Compute Capability: 12.0
  Total Memory: 95.59 GB
  Multiprocessors: 188

Example 02 - Memory Transfer

=== CUDA Memory Transfer Example ===

Allocated 4096 bytes on host
Input data: [0.0, 1.0, 2.0, ..., 1023.0]
Allocated 4096 bytes on device

Copying data from host to device...
✓ Host-to-device transfer complete

Copying data from device to host...
✓ Device-to-host transfer complete

✓ SUCCESS: All 1024 elements transferred correctly
Output data: [0.0, 1.0, 2.0, ..., 1023.0]

Example 03 - Kernel Launch

=== CUDA Kernel Launch Example ===

Input A: [0.0, 1.0, 2.0, ..., 1023.0]
Input B: [0.0, 2.0, 4.0, ..., 2046.0]

✓ PTX module loaded
✓ Kernel function extracted

Launching kernel with 4 blocks x 256 threads
✓ Kernel launched

✓ SUCCESS: Vector addition completed correctly
Result C: [0.0, 3.0, 6.0, ..., 3069.0]
Verified: C[i] = A[i] + B[i] for all 1024 elements

Example 04 - Streams

=== CUDA Streams Example ===

Creating 3 CUDA streams...
✓ 3 streams created

Memory allocated for 3 streams
Each stream handles 512 elements (2048 bytes)

Launching async memory copies on all streams...
Synchronizing all streams...
✓ All streams synchronized
Time elapsed: 2 ms

Querying stream status...
  Stream 0: ✓ Complete
  Stream 1: ✓ Complete
  Stream 2: ✓ Complete

✓ SUCCESS: Streams example completed
Total data transferred: 6 KB across 3 streams

Troubleshooting

"CUDA driver not found"

Ensure your NVIDIA driver is properly installed:

nvidia-smi

"No CUDA devices found"

Check that your GPU is recognized:

lspci | grep -i nvidia

"PTX module load failed"

Ensure the PTX file exists and the path is correct. PTX files are embedded at compile time using @embedFile().

Next Steps

Modify the examples to experiment with different configurations
Create your own CUDA kernels
Explore the cuBLAS integration for matrix operations
Check the test suite in test/ for more advanced usage patterns

Name		Name	Last commit message	Last commit date
parent directory ..
kernels		kernels
01_device_info.zig		01_device_info.zig
02_memory_transfer.zig		02_memory_transfer.zig
03_kernel_launch.zig		03_kernel_launch.zig
03_kernel_launch_fixed.zig		03_kernel_launch_fixed.zig
04_streams.zig		04_streams.zig
README.md		README.md
build_kernels.sh		build_kernels.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

ZigCUDA Examples

Prerequisites

Examples Overview

01_device_info.zig

02_memory_transfer.zig

03_kernel_launch.zig

04_streams.zig

Building All Examples

Kernels Directory

Compiling Your Own Kernels

Expected Output

Example 01 - Device Info

Example 02 - Memory Transfer

Example 03 - Kernel Launch

Example 04 - Streams

Troubleshooting

"CUDA driver not found"

"No CUDA devices found"

"PTX module load failed"

Next Steps

FilesExpand file tree

examples

Directory actions

More options

Directory actions

More options

Latest commit

History

examples

Folders and files

parent directory

README.md

ZigCUDA Examples

Prerequisites

Examples Overview

01_device_info.zig

02_memory_transfer.zig

03_kernel_launch.zig

04_streams.zig

Building All Examples

Kernels Directory

Compiling Your Own Kernels

Expected Output

Example 01 - Device Info

Example 02 - Memory Transfer

Example 03 - Kernel Launch

Example 04 - Streams

Troubleshooting

"CUDA driver not found"

"No CUDA devices found"

"PTX module load failed"

Next Steps