High-performance peer-to-peer DMA ring driver and toolchain for Thunderbolt 5, enabling GPU-to-GPU communication (via RCCL/NCCL), distributed file access, and performance testing between TB5-connected machines.
| System | USB4 | USB4v2 / TB5 | Status | Tester |
|---|---|---|---|---|
| Minisforum MS-S1 | Working | Partially working | Requires powering on with the cable connected (BIOS 1.06) | @Geramy |
Please submit a ticket proving your system works and I'll add you to the list.
- First, I would like to see if I can force TB5 to use more channels for TX on one host and RX on the other.
- Second, I will add a high-performance mode that uses CPU polling at a higher rate to reduce latency. The overall goal is to bring latency down to the level of InfiniBand RDMA / RoCE (2-5 us).
OdinLink turns a Thunderbolt 5 cable into a high-speed interconnect between two Linux machines. The kernel driver manages NHI DMA rings over the TB5 PCIe tunnel, providing:
- 80 Gbps raw throughput (Thunderbolt 5 bandwidth)
- Sub-microsecond latency for control messages
- Zero-copy GPU transfers via DMA-buf (RCCL plugin)
- Configurable ring size up to 64 MB per batch (256 MB total with double-buffered TX+RX)
- Character device interface (`/dev/odl_tb5_N`) with mmap'd double buffers
| Component | Binary / Module | Description |
|---|---|---|
| Kernel Driver | `odl_tb5.ko` | Thunderbolt service driver: NHI ring allocation, DMA buffer management, XDomain login/logout protocol |
| Userspace Library | `libodl_tb5.so` | C API for device open/close, double-buffer mmap, send/recv, DMA-buf, peer discovery |
| RCCL Plugin | `librccl_net_odl_tb5.so` | RCCL Net v7 network plugin for AMD GPU collective operations over TB5. Exposes shared-memory stats at `/run/odl_tb5/rccl_stats`. |
| CLI Tool | `odl_tb5_cli` | Client/server test tool: bandwidth, latency, jitter, latency-under-load, MIMO tests |
| System Daemon | `odl_tb5_daemon` | Background D-Bus service: device monitoring, test execution, RCCL stats, file operations |
| Tray Application | `odl_tb5_tray` | GTK3 system tray app: peer status, test runner, RCCL stats display, file management |
| Test Suite | `odl_tb5_test` | Unit and integration tests for the library and plugin |
- OS: Ubuntu 24.04 LTS
- Kernel: 6.18.7 (Thunderbolt 5 support required)
- Compiler: GCC 14+ (must match kernel build compiler)
- Build System: CMake 3.10+
- Hardware: Thunderbolt 5 ports with bridge cable
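The compiler requirement above can be checked before building. The following is a minimal sketch, not part of OdinLink: the helper names `kernel_gcc_major` and `gcc_matches_kernel` are hypothetical, the sample version string is made up for illustration, and the parsing assumes the `(gcc ... ) X.Y.Z,` layout that recent kernels print in `/proc/version`.

```shell
# Sketch only: helper names are hypothetical, not part of OdinLink.

# Print the major version of the GCC that built the kernel, given the
# contents of /proc/version. Assumes the "...gcc... (distro) X.Y.Z," layout.
kernel_gcc_major() {
    printf '%s\n' "$1" | sed -n 's/.*gcc[^)]*) \([0-9][0-9]*\)\.[0-9].*/\1/p'
}

# Succeed if the compiler version in $2 (e.g. the output of `gcc -dumpversion`)
# has the same major version as the kernel's compiler.
gcc_matches_kernel() {
    kmajor=$(kernel_gcc_major "$1")
    [ -n "$kmajor" ] && [ "$kmajor" = "${2%%.*}" ]
}

# Illustrative input; on a real system use "$(cat /proc/version)" instead.
sample='Linux version 6.18.7 (builder@host) (gcc (Ubuntu 14.2.0-4ubuntu2) 14.2.0, GNU ld 2.43) #1 SMP'
if gcc_matches_kernel "$sample" "14.2.0"; then
    echo "compiler matches kernel (GCC $(kernel_gcc_major "$sample"))"
else
    echo "compiler mismatch: install the GCC major version the kernel was built with"
fi
```

If the kernel string uses a different layout, `kernel_gcc_major` prints nothing and the check reports a mismatch, which is the safer failure mode here.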
sudo apt update
sudo apt install build-essential cmake linux-headers-$(uname -r) pkg-config
# GCC version must match your kernel (check with: cat /proc/version)
sudo apt install gcc-14 # for kernel 6.18+

The core components (driver, library, RCCL plugin, CLI, tests) build with no extra dependencies:
git clone <repository-url> OdinLink-Five
cd OdinLink-Five
mkdir build && cd build
cmake ..
make -j$(nproc)

The daemon and tray application require additional libraries. CMake auto-detects them and disables components if dependencies are missing.
Daemon dependencies (D-Bus service, device monitoring, test execution):
sudo apt install libglib2.0-dev

Tray application dependencies (system tray icon + GTK3 UI):
sudo apt install libgtk-3-dev libayatana-appindicator3-dev

Optional - FUSE distributed file access (transparent remote file reads over DMA):
sudo apt install libfuse3-dev

Optional - SHA-256 for file operations (used by the file transfer protocol):
sudo apt install libssl-dev

Then rebuild:
cd build
cmake .. && make -j$(nproc)

CMake will report which components are enabled:
-- BUILD_DAEMON: ON
-- BUILD_TRAY: ON
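If a component comes up `OFF`, it can help to see which development packages are missing before re-running CMake. The sketch below is an illustration only: `missing_modules` is a hypothetical helper, and the pkg-config module names are an assumption about what the build probes for (the project's CMake files are the authority).

```shell
# Sketch only: the module list is an assumption about what CMake probes for.

# Print the modules (arguments after the first) that the probe rejects.
# $1 is the probe command, e.g. "pkg-config --exists".
missing_modules() (
    probe=$1; shift
    missing=
    for m in "$@"; do
        # $probe is unquoted on purpose so "pkg-config --exists" splits in two
        $probe "$m" 2>/dev/null || missing="$missing $m"
    done
    echo "${missing# }"
)

if command -v pkg-config >/dev/null 2>&1; then
    echo "daemon missing:   $(missing_modules 'pkg-config --exists' glib-2.0 gio-2.0)"
    echo "tray missing:     $(missing_modules 'pkg-config --exists' gtk+-3.0 ayatana-appindicator3-0.1)"
    echo "optional missing: $(missing_modules 'pkg-config --exists' fuse3 openssl)"
fi
```

An empty list after a label means every module in that group was found.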
# Load with default ring size (4096 entries = 16 MB per batch)
sudo insmod driver/odl_tb5.ko
# Or load with custom ring size (power of 2, 64-16384)
sudo insmod driver/odl_tb5.ko ring_size=16384 # 64 MB per batch
# Verify
lsmod | grep odl_tb5
ls /dev/odl_tb5_*
# Install udev rule for persistent permissions
sudo cp driver/71-odl-tb5.rules /etc/udev/rules.d/
sudo udevadm control --reload-rules

Both machines must have the driver loaded and be connected via TB5 cable.
# Machine A (server):
./build/cli/odl_tb5_cli --server --device 0
# Machine B (client):
./build/cli/odl_tb5_cli --client --device 0 --test bandwidth
./build/cli/odl_tb5_cli --client --device 0 --test latency
./build/cli/odl_tb5_cli --client --device 0 --test jitter
./build/cli/odl_tb5_cli --client --device 0 --test latency-load
./build/cli/odl_tb5_cli --client --device 0 --test mimo

# Start daemon (foreground for debugging):
./build/daemon/odl_tb5_daemon -f
# Or install the systemd user service:
systemctl --user enable --now odl-tb5-daemon
# Start tray application:
./build/tray/odl_tb5_tray

  Machine A                          Machine B
+-----------+                      +-----------+
| Tray App  |  D-Bus               | Tray App  |
|  (GTK3)   |<------>+             |  (GTK3)   |
+-----------+        |             +-----------+
                     v                   ^
              +------------+       +------------+
              |   Daemon   |       |   Daemon   |
              | (GLib/GIO) |       | (GLib/GIO) |
              +------+-----+       +-----+------+
                     |                   |
              +------v-----+       +-----v------+
              | libodl_tb5 |       | libodl_tb5 |
              +------+-----+       +-----+------+
                     |                   |
              +------v-----+       +-----v------+
              | odl_tb5.ko |       | odl_tb5.ko |
              +------+-----+       +-----+------+
                     |                   |
                     +---< TB5 Cable >---+
                          80 Gbps DMA
- Internal double-buffer path: mmap'd 16-64 MB buffers for CLI tests, file transfers, control messages
- External DMA-buf path: zero-copy GPU memory transfers for RCCL collective operations
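To put the 16-64 MB buffer sizes in perspective, the raw link rate gives a lower bound on how long one batch spends on the wire. This is back-of-the-envelope arithmetic under stated assumptions, not a measurement: it uses the 80 Gbps raw figure, treats 1 MB as 1024 x 1024 bytes, and ignores protocol overhead; `min_transfer_us` is a hypothetical helper name.

```shell
# Sketch only: a lower bound from the raw 80 Gbps figure; ignores all overhead.
LINK_BITS_PER_US=80000   # 80 Gbps = 80,000 bits per microsecond

# Minimum microseconds to move $1 MB at the raw link rate (rounded down).
min_transfer_us() {
    echo $(( $1 * 1024 * 1024 * 8 / LINK_BITS_PER_US ))
}

for mb in 16 64; do
    echo "${mb} MB batch: at least $(min_transfer_us "$mb") us on the wire"
done
# prints:
#   16 MB batch: at least 1677 us on the wire
#   64 MB batch: at least 6710 us on the wire
```

Real transfers will take longer than this bound; the CLI bandwidth test is the way to get actual numbers.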
odl_tb5.ko is built from four source files:
| File | Purpose |
|---|---|
| `odl_tb5_service.c` | Thunderbolt service probe/remove, module init/exit, ring size module parameter |
| `odl_tb5_ring_dma.c` | NHI ring allocation, dynamic frame arrays, DMA buffer management |
| `odl_tb5_chardev.c` | Character device `/dev/odl_tb5_N`, ioctl dispatch, mmap handler |
| `odl_tb5_proto.c` | XDomain login/logout handshake over Thunderbolt properties protocol |
| Parameter | Default | Range | Description |
|---|---|---|---|
| `ring_size` | 4096 | 64-16384 | NHI ring entries per direction (power of 2). Each entry = 4 KB. Default = 16 MB per batch, 64 MB total. |
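The sizes in the table follow from simple arithmetic: entries times 4 KB gives the per-batch size, and the total is four times that (two directions, each double-buffered, which is my reading of the "double-buffered TX+RX" figures). The sketch below works this through for a few values; the function names are hypothetical helpers, not driver code.

```shell
# Sketch only: function names are hypothetical helpers, not driver code.
ENTRY_KIB=4   # each ring entry is 4 KB (from the parameter table)

# Succeed if $1 is a power of two within the documented 64-16384 range.
ring_size_valid() {
    case $1 in
        ''|0*|*[!0-9]*) return 1 ;;   # reject empty, leading zero, non-digits
    esac
    [ "$1" -ge 64 ] && [ "$1" -le 16384 ] && [ $(( $1 & ($1 - 1) )) -eq 0 ]
}

# KiB moved per batch in one direction: entries * 4 KiB.
batch_kib() { echo $(( $1 * ENTRY_KIB )); }

# KiB across TX+RX with double buffering: 2 directions * 2 buffers.
total_kib() { echo $(( $1 * ENTRY_KIB * 4 )); }

for n in 1024 4096 16384; do
    ring_size_valid "$n" || continue
    echo "ring_size=$n: $(( $(batch_kib "$n") / 1024 )) MB per batch, $(( $(total_kib "$n") / 1024 )) MB total"
done
# prints:
#   ring_size=1024: 4 MB per batch, 16 MB total
#   ring_size=4096: 16 MB per batch, 64 MB total
#   ring_size=16384: 64 MB per batch, 256 MB total
```

A check like `ring_size_valid` is handy in wrapper scripts so that an out-of-range value is caught before `insmod` is attempted.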
export RCCL_NET_PLUGIN=ODL_TB5
export RCCL_PLUGIN_DIR=/path/to/build/rccl
# Your RCCL/ROCm application will use TB5 automatically

The RCCL plugin exports shared-memory statistics at /run/odl_tb5/rccl_stats. The daemon reads these and exposes them via D-Bus; the tray app displays TX/RX bytes, operation counts, and uptime in a dedicated RCCL Stats window.
OdinLink-Five/
+-- CMakeLists.txt Root build config + CPack packaging
+-- README.md
+-- driver/ Kernel module (odl_tb5.ko)
| +-- odl_tb5_service.c Service driver registration
| +-- odl_tb5_ring_dma.c NHI ring + DMA buffer management
| +-- odl_tb5_chardev.c Character device interface
| +-- odl_tb5_proto.c XDomain login/logout protocol
| +-- odl_tb5_core.h Internal kernel header
| +-- uapi/odl_tb5_uapi.h Userspace API (ioctl defs, structs)
| +-- 71-odl-tb5.rules udev rules (uaccess, NHI runtime PM)
| +-- Kbuild, Makefile
+-- lib/ Userspace library (libodl_tb5.so)
| +-- include/odl_tb5/
| | +-- odl_tb5.h Public API
| | +-- odl_tb5_types.h Shared type definitions
| | +-- odl_tb5_ioctl.h Ioctl definitions (userspace mirror)
| | +-- odl_tb5_rccl_stats.h RCCL shared-memory stats struct
| +-- src/
| +-- odl_tb5_dev.c Device open/close, mmap
| +-- odl_tb5_xfer.c Send/recv (internal + DMA-buf)
| +-- odl_tb5_peer.c Peer discovery
| +-- odl_tb5_completion.c Poll/wait completions
+-- rccl/ RCCL Net v7 plugin
| +-- src/odl_tb5_plugin.c Plugin with shared-memory stats
+-- cli/ CLI test tool
| +-- src/
| +-- odl_tb5_cli_main.c CLI entry point
| +-- odl_tb5_cli.h Protocol + test definitions
| +-- odl_tb5_cli_proto.c In-band control protocol
| +-- odl_tb5_cli_server.c Server mode
| +-- odl_tb5_cli_client.c Client mode
| +-- odl_tb5_cli_bandwidth.c Bandwidth test
| +-- odl_tb5_cli_latency.c Latency test
| +-- odl_tb5_cli_jitter.c Jitter test
| +-- odl_tb5_cli_latency_load.c Latency-under-load test
| +-- odl_tb5_cli_mimo.c MIMO (multi-stream) test
| +-- odl_tb5_cli_stats.c Statistics + histograms
+-- daemon/ System daemon (odl_tb5_daemon)
| +-- src/
| | +-- odl_tb5_daemon_main.c GMainLoop, signal handling
| | +-- odl_tb5_daemon_dbus.c/h D-Bus service (com.odinlink.Tb5Daemon)
| | +-- odl_tb5_daemon_monitor.c/h Device scan (polls /dev/odl_tb5_N)
| | +-- odl_tb5_daemon_test.c/h Test executor (GThreadPool)
| | +-- odl_tb5_daemon_rccl_stats.c/h RCCL stats reader
| | +-- odl_tb5_daemon_sync.c/h File operations engine
| | +-- odl_tb5_daemon_sync_proto.c/h File transfer wire protocol
| | +-- odl_tb5_daemon_config.c/h Config (~/.config/odl_tb5/)
| +-- dbus/com.odinlink.Tb5Daemon.xml D-Bus interface definition
| +-- data/
| +-- odl-tb5-daemon.service Systemd user unit
| +-- com.odinlink.Tb5Daemon.service D-Bus activation
+-- tray/ System tray application (odl_tb5_tray)
| +-- src/
| | +-- odl_tb5_tray_main.c GTK3 init, AppIndicator setup
| | +-- odl_tb5_tray.h Internal header
| | +-- odl_tb5_tray_dbus.c D-Bus proxy client
| | +-- odl_tb5_tray_menu.c Tray menu + callbacks
| | +-- odl_tb5_tray_peers.c Peer detail popup
| | +-- odl_tb5_tray_tests.c Test runner dialog
| | +-- odl_tb5_tray_rccl.c RCCL stats window
| | +-- odl_tb5_tray_sync.c File management UI
| +-- icons/ SVG tray icons
| +-- data/odl-tb5-tray.desktop Autostart .desktop file
+-- tests/ Unit + integration tests
| +-- odl_tb5_test_main.c
| +-- odl_tb5_test_device.c
| +-- odl_tb5_test_lib_api.c
| +-- odl_tb5_test_plugin.c
+-- packaging/ .deb packaging (CPack + DKMS)
| +-- dkms.conf, dkms-postinst.sh, dkms-prerm.sh
| +-- daemon-postinst.sh
| +-- build-meta-debs.sh.in
+-- third_party/rccl/net_v7.h RCCL Net v7 header
| Component | Ubuntu Package | Required For |
|---|---|---|
| build-essential | `build-essential` | All (compiler + make) |
| cmake | `cmake` | All (build system) |
| linux-headers | `linux-headers-$(uname -r)` | Kernel module |
| gcc-14+ | `gcc-14` | Kernel module (must match kernel) |
| pkg-config | `pkg-config` | Daemon + Tray dependency detection |
| glib-2.0 | `libglib2.0-dev` | Daemon |
| gio-2.0 | `libglib2.0-dev` | Daemon (D-Bus) |
| gtk+-3.0 | `libgtk-3-dev` | Tray application |
| ayatana-appindicator3 | `libayatana-appindicator3-dev` | Tray application (system tray icon) |
| fuse3 | `libfuse3-dev` | Daemon (optional: FUSE distributed file access) |
| openssl | `libssl-dev` | Daemon (optional: SHA-256 for file operations) |

# Core (always needed):
sudo apt install build-essential cmake linux-headers-$(uname -r) gcc-14 pkg-config
# Daemon:
sudo apt install libglib2.0-dev
# Tray:
sudo apt install libgtk-3-dev libayatana-appindicator3-dev
# Optional (FUSE + file operations):
sudo apt install libfuse3-dev libssl-dev

Build installable packages:
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)
cpack # Individual component .debs
make meta-packages # User-friendly bundles

| Package | Contents |
|---|---|
| `odl-tb5-minimal` | dkms + library + RCCL plugin (GPU cluster node) |
| `odl-tb5-server` | dkms + library + CLI + daemon + RCCL plugin (headless server) |
| `odl-tb5-desktop` | dkms + library + CLI + daemon + tray (desktop workstation) |
| `odl-tb5-full` | Everything |

- Kernel module build fails with unknown GCC flags: Your GCC is too old. Install the version matching your kernel (`cat /proc/version`).
- Module won't load: Check `dmesg | grep odl_tb5` for errors. Ensure TB5 hardware is present (`lspci | grep Thunderbolt`).
- No `/dev/odl_tb5_*` devices: The device appears only when a TB5 peer connects. Check `dmesg` for XDomain events.
- Permission denied: Install the udev rule or run `sudo chmod 660 /dev/odl_tb5_*`.
- Daemon won't start: Check `journalctl --user -u odl-tb5-daemon` for D-Bus errors.
- Tray icon not visible: Install `gnome-shell-extension-appindicator` on GNOME/Wayland desktops.

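Because the device node only appears once a peer connects, scripts that start a test right after loading the driver may need to wait for it. A minimal sketch of such a wait loop follows; `wait_for_device` is a hypothetical helper, not an OdinLink tool, and it assumes the glob contains no spaces.

```shell
# Sketch only: wait_for_device is a hypothetical helper, not an OdinLink tool.

# Poll once per second until a path matching glob $1 exists, or $2 seconds
# have elapsed. Prints the first match; returns 1 on timeout.
wait_for_device() (
    pattern=$1
    timeout=$2
    waited=0
    while :; do
        for f in $pattern; do          # unquoted on purpose: expand the glob
            if [ -e "$f" ]; then
                echo "$f"
                exit 0
            fi
        done
        [ "$waited" -ge "$timeout" ] && exit 1
        sleep 1
        waited=$((waited + 1))
    done
)

# One immediate check (timeout 0); use e.g. 30 to wait for a peer to connect.
if dev=$(wait_for_device '/dev/odl_tb5_*' 0); then
    echo "peer device: $dev"
else
    echo "no peer device yet; check dmesg for XDomain events"
fi
```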
# Kernel driver debug
echo 'module odl_tb5 +p' | sudo tee /sys/kernel/debug/dynamic_debug/control
dmesg -w | grep odl_tb5
# Daemon foreground with verbose output
./build/daemon/odl_tb5_daemon -f
# RCCL debug
export RCCL_DEBUG=INFO
export RCCL_NET_PLUGIN=ODL_TB5

Note: If you have trouble getting the TB5 ports to work, I have seen reports that adding the following kernel parameter to the GRUB boot command line helps.
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pcie_port_pm=off"
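After editing the GRUB configuration, it is worth confirming that the running kernel actually picked up the parameter. The sketch below checks a kernel command line for an exact word; `has_kernel_param` is a hypothetical helper, and the hint about `/etc/default/grub` and `update-grub` assumes a stock Ubuntu setup.

```shell
# Sketch only: has_kernel_param is a hypothetical helper.

# Succeed if the space-separated command line $1 contains the exact word $2.
has_kernel_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# /proc/cmdline holds the command line of the currently running kernel.
if [ -r /proc/cmdline ] && has_kernel_param "$(cat /proc/cmdline)" pcie_port_pm=off; then
    echo "pcie_port_pm=off is active"
else
    echo "pcie_port_pm=off is not active; after editing /etc/default/grub, run update-grub and reboot"
fi
```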
MIT

