This repository provides hands-on examples and learning resources for working with large language models (LLMs) in local development environments.
- Local inference with Ollama and llama.cpp
- Direct model loading with HuggingFace Transformers
- LangChain: prompt templates, output parsers, chains, and agents
- RAG (Retrieval-Augmented Generation) with pgvector
- Gradio web interfaces
- Prompting techniques: zero-shot, few-shot, chain-of-thought, ReAct
- 9 demos: chatbots, LangChain patterns, agents, RAG knowledge systems, fine-tuning & evaluation
- 8 slide decks: covering deployment, prompting, LangChain, fine-tuning, and evaluation
- 7 activities: hands-on exercises building on each demo
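As a taste of the prompting techniques listed above, a few-shot prompt is just a task instruction followed by labeled examples, with the new input left for the model to complete. A minimal sketch using only the standard library (the reviews and labels below are invented for illustration, not taken from the demos):

```python
# Labeled examples that demonstrate the task to the model.
# Invented for illustration.
EXAMPLES = [
    ("The plot was gripping from start to finish.", "positive"),
    ("I walked out halfway through.", "negative"),
]

def few_shot_prompt(query: str) -> str:
    """Assemble an instruction, the labeled examples, and the new input."""
    lines = ["Classify the sentiment of each review as positive or negative.", ""]
    for text, label in EXAMPLES:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # The final, unanswered block is what the model completes.
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)

print(few_shot_prompt("A fantastic surprise."))
```

Dropping the examples from the prompt turns the same scaffold into a zero-shot prompt.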
Complete documentation: https://gperdrizet.github.io/llms-demo
The documentation covers:
- Setup and installation
- Demo usage and concepts
- Inference server configuration
- Library reference with code examples
- Model specifications and serving commands
- Systemd deployment for production use
- Slide and activity guides
- Click Fork in the top-right corner of this repo on GitHub to create your own copy.
- Clone your fork:

  ```bash
  git clone https://github.com/<your-username>/llms-demo.git
  ```

- Open the cloned folder in VS Code.
- When prompted "Reopen in Container", click it, or run the command Dev Containers: Reopen in Container from the Command Palette (`Ctrl+Shift+P`).
- VS Code will build and start the container. This takes a few minutes the first time.
The dev container is based on the gperdrizet/llms-gpu image (NVIDIA GPU-enabled). On first creation, the postCreateCommand runs automatically and does the following:
| Step | What it does |
|---|---|
| `mkdir -p models/hugging_face && mkdir -p models/ollama` | Creates local directories for model storage |
| `pip install -r requirements.txt` | Installs Python dependencies: bert-score, evaluate, gradio, huggingface-hub, langchain-ollama, openai, peft, python-dotenv, trl, torch, transformers |
| `bash .devcontainer/install_ollama.sh` | Downloads and installs the Ollama CLI |
The container also pre-configures the following:
| Setting | Detail |
|---|---|
| GPU access | All host GPUs are passed through (--gpus all) |
| Python interpreter | /usr/bin/python is set as the default |
| `HF_HOME` | Points to `models/hugging_face` so Hugging Face downloads stay in the repo |
| `OLLAMA_MODELS` | Points to `models/ollama` so Ollama downloads stay in the repo |
| Port 7860 | Forwarded automatically for Gradio web UIs |
| VS Code extensions | Python, Jupyter, Code Spell Checker, and Marp (slide viewer) are installed |
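Taken together, these settings correspond to `devcontainer.json` entries along the following lines. This is an illustrative sketch reconstructed from the table above, not the repo's actual file; the exact field values may differ:

```jsonc
// .devcontainer/devcontainer.json (illustrative sketch)
{
  "image": "gperdrizet/llms-gpu",
  "runArgs": ["--gpus", "all"],              // pass all host GPUs through
  "containerEnv": {
    "HF_HOME": "${containerWorkspaceFolder}/models/hugging_face",
    "OLLAMA_MODELS": "${containerWorkspaceFolder}/models/ollama"
  },
  "forwardPorts": [7860],                    // Gradio web UIs
  "postCreateCommand": "mkdir -p models/hugging_face models/ollama && pip install -r requirements.txt && bash .devcontainer/install_ollama.sh"
}
```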
Once the container is ready, you can start running the demos; no extra setup is needed.
See the Demos documentation for detailed instructions on running each chatbot, including:
- Concepts covered in each demo
- Tools and libraries used
- Step-by-step setup and execution
Quick example - Ollama chatbot:

```bash
# 1. Start the Ollama server
ollama serve

# 2. Pull a model (in another terminal)
ollama pull qwen2.5:3b

# 3. Run the chatbot
python demos/chatbots/ollama_chatbot.py
```

For complete instructions on all four demos, visit the documentation.
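The same request flow can be reproduced in a few lines of Python against Ollama's local REST API. A sketch, not the demo's actual code: it assumes the default server address and uses Ollama's documented `/api/generate` endpoint with its `model`, `prompt`, and `stream` fields:

```python
import json
from urllib import request

# Default address of a locally running `ollama serve`
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model: str, prompt: str) -> dict:
    # stream=False asks for a single JSON response instead of chunks
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    """POST a prompt to the Ollama server and return the response text."""
    req = request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)["response"]
```

With `ollama serve` running and the model pulled, `generate("qwen2.5:3b", "Say hello in one word.")` returns the model's completion text.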