Optimizing Reasoning LLM Deployment on Edge GPUs
Updated Oct 21, 2025 - Python
A high-performance, modular AI chat solution for NVIDIA Jetson™ edge devices. It integrates Ollama serving the Meta Llama 3.2 3B model for LLM inference, FastAPI-based LangChain middleware, and an OpenWebUI front end.

SAP Warehouse Copilot for Reachy Mini — NVIDIA NIM + Riva Speech AI + SAP OData
Enables voice-controlled SAP inventory management using the Reachy Mini robot, powered by NVIDIA AI, for real-time stock and order updates.
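As a rough illustration of how middleware like the Ollama-based chat stack above typically talks to a local Ollama server, here is a minimal sketch that builds the JSON body for Ollama's /api/chat endpoint. The endpoint path, the fields `model`, `messages`, and `stream`, and the `llama3.2:3b` model tag follow Ollama's public REST API; the system prompt and everything else are illustrative assumptions, not code from the repository.

```python
import json

# Default endpoint of a local Ollama install (assumed; configurable in practice).
OLLAMA_URL = "http://localhost:11434/api/chat"


def build_chat_request(user_message: str,
                       system_prompt: str = "You are a helpful assistant.") -> bytes:
    """Build the JSON body Ollama's /api/chat endpoint expects."""
    payload = {
        "model": "llama3.2:3b",  # Meta Llama 3.2 3B, e.g. after `ollama pull llama3.2:3b`
        "stream": False,         # request one complete response instead of streamed chunks
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
    }
    return json.dumps(payload).encode("utf-8")


# A real deployment would POST this body with Content-Type: application/json,
# e.g. via urllib.request, httpx, or a LangChain chat-model wrapper.
body = build_chat_request("Summarize today's inbound deliveries.")
```

In a FastAPI middleware layer, a route handler would construct this body from the incoming chat request and forward it to the Ollama server, returning the model's reply to OpenWebUI.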
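For the SAP Warehouse Copilot, a stock lookup would ultimately be phrased as an SAP OData query. The sketch below shows one plausible shape of such a query URL using standard OData v2 query options (`$filter`, `$select`, `$format`); the service path, entity set name, and field names are invented placeholders, not the actual SAP service the repository uses.

```python
from urllib.parse import urlencode


def build_stock_query(base_url: str, material: str, plant: str) -> str:
    """Build an OData GET URL filtering stock by material and plant.

    The entity set `StockSet` and the fields `Material`, `Plant`, and
    `QuantityOnHand` are hypothetical placeholders for illustration only.
    """
    params = {
        "$filter": f"Material eq '{material}' and Plant eq '{plant}'",
        "$select": "Material,Plant,QuantityOnHand",
        "$format": "json",
    }
    return f"{base_url}/StockSet?{urlencode(params)}"


# Hypothetical SAP Gateway service URL, for illustration only.
url = build_stock_query(
    "https://sap.example.com/sap/opu/odata/sap/ZWM_STOCK_SRV", "4711", "1000"
)
```

In the voice-driven flow, speech recognized by Riva would be mapped to parameters like these, and the JSON response from the OData service would be read back to the user.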