A self-directed curriculum for learning C, structured in phases. Each phase covers the concepts needed to tackle a project slightly beyond current ability, forcing learning as you go. Concepts are introduced before the project that needs them, but kept brief - the project is where understanding solidifies.
Planned collaboratively with AI assistance and updated as I progress through and review each phase.
- Phase 1 - Foundations
- Phase 2 - Data Structures and the C Idiom
- Phase 3 - Systems Thinking
- Phase 4 - Low-Level Numerics and Performance
- Phase 5 - Neural Network from Scratch
- Standalone - A Linear Programming Engine
- Phase 6 - Interpreter / Compiler
- Phase 7 - Science Simulator (Open-ended)
Goal: Get comfortable enough with C syntax, memory, and tooling to build something small but real.
- Compilation model: preprocessor → compiler → linker, gcc/clang, Makefile basics
- Primitive types, operators, control flow
- Functions, scope, stack frames (mental model, not assembly)
- Arrays and pointer arithmetic (the core of Phase 1)
- Pointers to pointers, pointer-to-struct patterns
- Strings as char arrays, <string.h>
- Structs and basic typedef patterns
- Manual memory: malloc, free, calloc, heap vs stack
- Dynamic arrays: geometric growth with realloc
- Header files, multi-file projects, #include guards
- File I/O: fopen, fread, fwrite, fclose
- <stdio.h>, <stdlib.h>, <string.h> (the everyday headers)
- Basic debugging: gdb, valgrind for memory leaks
- Sanitizer flags: -fsanitize=address (ASan), -fsanitize=undefined (UBSan) - run these during development on every project
- C Programming Full Course for free (for setup and basics)
- The C Programming Language (Kernighan & Ritchie): read chapters 1–6 alongside the project, not before
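The "geometric growth with realloc" item in the concept list can be sketched as below. This is a minimal illustration, not a prescribed design: the IntVec type, the function names, the starting capacity of 8 and the doubling factor are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical growable array of ints; all names here are illustrative. */
typedef struct {
    int *data;
    size_t len;   /* elements in use */
    size_t cap;   /* elements allocated */
} IntVec;

/* Append one value, doubling the capacity when the array is full.
   Returns 0 on success, -1 if allocation fails (the vector is left unchanged). */
static int intvec_push(IntVec *v, int value)
{
    assert(v != NULL);
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 8;
        /* Store realloc's result in a temporary: on failure it returns NULL
           and the original block is still valid and still owned by us. */
        int *tmp = realloc(v->data, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return -1;
        v->data = tmp;
        v->cap = new_cap;
    }
    v->data[v->len++] = value;
    return 0;
}

/* Release the storage and reset the vector to its empty state. */
static void intvec_free(IntVec *v)
{
    free(v->data);
    v->data = NULL;
    v->len = v->cap = 0;
}
```

Starting from a zero-initialised IntVec, pushing 100 values triggers only five reallocations (8, 16, 32, 64, 128), which is the point of growing geometrically rather than one slot at a time.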
Parse a subset of JSON (strings, numbers, booleans, null, nested objects, arrays) from a file into a C struct tree, then pretty-print it back.
Why: Forces handling of strings, structs, recursive data structures, and manual memory (all in Phase 1 territory). A real problem with a clear correctness signal.
Stretch: Add a query interface supporting dot-separated path traversal across nested objects and arrays (e.g. json_get_path(root, "users.0.name") returns the matching node).
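One possible shape for the struct tree and the json_get_path stretch goal is sketched below. Only the json_get_path(root, "users.0.name") call comes from the description above; the JsonNode layout, the enum names and the decision to store object members and array items in the same child array are assumptions for illustration. Keys that themselves contain a dot are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical node layout: a tagged union, one node per JSON value. */
typedef enum { JSON_NULL, JSON_BOOL, JSON_NUMBER, JSON_STRING,
               JSON_ARRAY, JSON_OBJECT } JsonType;

typedef struct JsonNode JsonNode;
struct JsonNode {
    JsonType type;
    const char *key;   /* member name if the parent is an object, else NULL */
    union {
        int boolean;
        double number;
        const char *string;
        struct { JsonNode **items; size_t count; } children; /* array or object */
    } as;
};

/* Walk a dot-separated path such as "users.0.name".
   Returns the matching node, or NULL if any segment fails to resolve.
   An empty path returns root itself. */
static const JsonNode *json_get_path(const JsonNode *root, const char *path)
{
    assert(path != NULL);
    const JsonNode *cur = root;
    while (cur != NULL && *path != '\0') {
        size_t seg_len = strcspn(path, ".");   /* length up to next '.' or end */
        const JsonNode *next = NULL;

        if (cur->type == JSON_OBJECT) {
            /* Object: match the segment against each member's key. */
            for (size_t i = 0; i < cur->as.children.count; i++) {
                const JsonNode *child = cur->as.children.items[i];
                if (child->key != NULL && strlen(child->key) == seg_len &&
                    memcmp(child->key, path, seg_len) == 0) {
                    next = child;
                    break;
                }
            }
        } else if (cur->type == JSON_ARRAY && seg_len > 0) {
            /* Array: the segment must be all digits and in range. */
            size_t idx = 0, i = 0;
            for (; i < seg_len && path[i] >= '0' && path[i] <= '9'; i++)
                idx = idx * 10 + (size_t)(path[i] - '0');
            if (i == seg_len && idx < cur->as.children.count)
                next = cur->as.children.items[idx];
        }
        /* Scalars have no children, so next stays NULL and the walk ends. */
        cur = next;
        path += seg_len;
        if (*path == '.')
            path++;
    }
    return cur;
}
```

Storing members as a flat array keeps the parser simple; lookups are linear in the number of members, which is usually fine for small documents.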
Completed Example: cjson
Goal: Build the common data structures that high-level languages let you take for granted. Own memory at a deeper level.
- Linked lists, doubly linked lists
- Hash maps: open addressing or chaining (your choice)
- Generic programming via void *: handling data without knowing its type at compile time
- Function pointers: syntax and use cases
- #define macros vs static inline functions
- Error handling patterns in C (return codes, errno)
- Bit manipulation basics
- The C Programming Language (Kernighan & Ritchie): read chapters 6–8
- Hacking: The Art of Exploitation (Jon Erickson): chapters 1–2 (optional, great for memory intuition)
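The "#define macros vs static inline functions" item is easiest to see with a side-effecting argument. The sketch below uses made-up names (SQUARE_MACRO, square_inline, next_value) purely for illustration: the macro pastes its argument text twice, so the argument expression runs twice, while the inline function evaluates it once.

```c
#include <assert.h>

/* Function-like macro: the argument text is substituted twice. */
#define SQUARE_MACRO(x) ((x) * (x))

/* static inline function: the argument is evaluated once and type-checked. */
static inline int square_inline(int x)
{
    return x * x;
}

/* Helper with a visible side effect so evaluations can be counted. */
static int calls = 0;
static int next_value(void)
{
    calls++;
    return 3;
}

/* Returns how many times next_value() ran when passed through the macro. */
static int evaluations_via_macro(void)
{
    calls = 0;
    int r = SQUARE_MACRO(next_value()); /* ((next_value()) * (next_value())) */
    assert(r == 9);
    return calls;
}

/* Returns how many times next_value() ran when passed to the function. */
static int evaluations_via_inline(void)
{
    calls = 0;
    int r = square_inline(next_value());
    assert(r == 9);
    return calls;
}
```

Here the double evaluation is only wasteful. With an argument like i++ the macro version would modify i twice without sequencing, which is undefined behaviour.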
Build a reusable C library providing fundamental data structures (Vector, HashMap, LinkedList) designed to handle generic data using void *.
Why: In C, you don't get a standard collections library; you have to build your own. This project forces you to master generic programming (managing item_size and memory offsets), function pointers (for custom comparators and destructors), and API design. It transitions you from writing "programs" to writing "tools" that you will actually import and use in every subsequent phase of this curriculum.
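A rough sketch of what "managing item_size and memory offsets" and "function pointers for custom comparators" can look like in practice. The Vector layout and every vec_* name are assumptions for this example, not a fixed API; a destructor callback would follow the same pattern as the comparator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical generic vector: stores copies of fixed-size items as raw bytes. */
typedef struct {
    unsigned char *data;   /* byte pointer so element offsets can be computed */
    size_t item_size;      /* size of one element in bytes */
    size_t len;
    size_t cap;
} Vector;

/* Comparator in the style of qsort: returns 0 when the two items match. */
typedef int (*vec_cmp_fn)(const void *a, const void *b);

static void vec_init(Vector *v, size_t item_size)
{
    assert(item_size > 0);
    v->data = NULL;
    v->item_size = item_size;
    v->len = v->cap = 0;
}

/* Copy one item onto the end; returns 0 on success, -1 on allocation failure. */
static int vec_push(Vector *v, const void *item)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 8;
        unsigned char *tmp = realloc(v->data, new_cap * v->item_size);
        if (tmp == NULL)
            return -1;
        v->data = tmp;
        v->cap = new_cap;
    }
    /* Element i lives at byte offset i * item_size. */
    memcpy(v->data + v->len * v->item_size, item, v->item_size);
    v->len++;
    return 0;
}

/* Pointer to element i, or NULL when i is out of range. */
static void *vec_get(const Vector *v, size_t i)
{
    return i < v->len ? v->data + i * v->item_size : NULL;
}

/* Linear search with a caller-supplied comparator; returns -1 if not found. */
static ptrdiff_t vec_find(const Vector *v, const void *key, vec_cmp_fn cmp)
{
    for (size_t i = 0; i < v->len; i++)
        if (cmp(v->data + i * v->item_size, key) == 0)
            return (ptrdiff_t)i;
    return -1;
}

static void vec_free(Vector *v)
{
    free(v->data);
    v->data = NULL;
    v->len = v->cap = 0;
}

/* Example comparator for a vector of doubles. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}
```

The caller is responsible for casting the void * from vec_get back to the right type; nothing in the vector can check that, which is the trade-off of this style of generic code.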
Stretch: Implement a Region-based (Arena) Allocator. Instead of hundreds of individual malloc calls, pre-allocate a large block of memory and serve your data structures from it. This cuts per-allocation overhead and simplifies memory management; freeing the entire Arena at once avoids complex "deep-freeing" of nested structures.
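A minimal bump-pointer sketch of the arena idea, assuming a single fixed-size block and C11 (for max_align_t and _Alignof). The Arena type and arena_* names are hypothetical; a fuller version might chain several blocks instead of failing when the first one is full.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fixed-capacity arena: one malloc, many cheap allocations. */
typedef struct {
    unsigned char *base;   /* start of the pre-allocated block */
    size_t cap;            /* total bytes available */
    size_t used;           /* bytes handed out so far */
} Arena;

/* Returns 0 on success, -1 if the backing block cannot be allocated. */
static int arena_init(Arena *a, size_t cap)
{
    a->base = malloc(cap);
    a->cap = a->base ? cap : 0;
    a->used = 0;
    return a->base ? 0 : -1;
}

/* Hand out size bytes aligned for any object type, or NULL if full. */
static void *arena_alloc(Arena *a, size_t size)
{
    const size_t align = _Alignof(max_align_t);   /* a power of two */
    size_t start = (a->used + align - 1) & ~(align - 1);  /* round up */
    if (start > a->cap || size > a->cap - start)
        return NULL;
    a->used = start + size;
    return a->base + start;
}

/* Forget every allocation at once; the block itself is kept for reuse. */
static void arena_reset(Arena *a)
{
    a->used = 0;
}

/* Return the whole block to the system in a single free. */
static void arena_free(Arena *a)
{
    free(a->base);
    a->base = NULL;
    a->cap = a->used = 0;
}
```

Individual allocations are never freed; everything built inside the arena goes away with one arena_reset or arena_free call, which is what removes the need for deep-freeing.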
Goal: Apply Phase 2 data structure skills in a networked context before moving to full OS interaction in Phase 3.
- TCP sockets: socket, bind, listen, accept, send, recv
- The HTTP/1.1 request/response format
- Parsing raw byte streams
- Dynamic string buffers for request and response construction
- Basic file serving: reading files and writing them to a socket
- Computer Systems: A Programmer's Perspective (CS:APP): chapter 11 (network programming)
- RFC 7230 (HTTP/1.1 message syntax; since obsoleted by RFC 9112, which covers the same ground): skim the relevant sections, not cover to cover
- Linux man pages for socket APIs
Accept TCP connections, parse raw HTTP GET requests, serve static files from a directory, return correct status codes (200, 404, 405).
Why: Requires sockets, string parsing, and dynamic buffers; a natural extension of Phase 2 skills into a networked context. Satisfying to test: you can curl it or open it in a browser.
Stretch: Handle multiple concurrent connections with fork() or select().
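The request-parsing part of the project can be tried without any sockets. The sketch below parses only the request line and picks a status code from it. The RequestLine type and parse_request_line name are made up for the example, and the 400 response for malformed input or ".." in the path is an assumption beyond the 200/404/405 codes listed in the project description.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for the two fields a static file server needs. */
typedef struct {
    char method[16];
    char path[256];
} RequestLine;

/* Parse "METHOD SP target SP HTTP/1.x" and return the status suggested by
   the request line alone: 400 if malformed, 405 if the method is not GET,
   200 otherwise. The caller would still answer 404 if the file is missing. */
static int parse_request_line(const char *line, RequestLine *out)
{
    char version[16];
    assert(line != NULL && out != NULL);

    /* Field widths stop sscanf from writing past the buffers. */
    if (sscanf(line, "%15s %255s %15s", out->method, out->path, version) != 3)
        return 400;
    if (strncmp(version, "HTTP/1.", 7) != 0)
        return 400;
    if (strcmp(out->method, "GET") != 0)
        return 405;
    /* Crude guard (an assumption of this sketch) against escaping the
       served directory via "..". */
    if (strstr(out->path, "..") != NULL)
        return 400;
    return 200;
}
```

In the real server the input arrives in arbitrary-sized chunks from recv, so the request line has to be accumulated in a dynamic buffer until the terminating CRLF appears before a function like this can run.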
Goal: Understand how C sits on top of the OS. Build something that interacts directly with the system.
- Process model: fork, exec, wait, pipe
- File descriptors, dup2, redirection
- Signals: SIGINT, SIGCHLD, signal()
- Environment variables
- mmap and memory-mapped files (intro level)
- Makefiles - proper multi-target builds, dependency tracking
- Address space layout: text, data, BSS, heap, stack
- Computer Systems: A Programmer's Perspective (CS:APP): chapters 8 (exceptional control flow) and 10 (system-level I/O)
- Linux man pages
Support command execution, | pipes, </> redirection, background jobs (&), built-ins (cd, exit, export).
Why: Directly exercises the process model, file descriptors, and signal handling. Classic systems project for good reason.
Stretch: Job control (fg, bg, jobs), history with arrow keys via termios.
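The core move behind both pipes and > redirection is the same: fork, rewire a file descriptor in the child with dup2, then exec. The sketch below shows that sequence by capturing a command's stdout into a buffer. It assumes a POSIX system; run_and_capture is a hypothetical helper, not part of the shell's required design, and error handling is kept to a minimum.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with its stdout redirected into a pipe and copy up to
   cap - 1 bytes of output into buf (always NUL-terminated).
   Returns the child's exit status, or -1 on failure or abnormal exit. */
static int run_and_capture(char *const argv[], char *buf, size_t cap)
{
    int fds[2];
    assert(cap > 0);
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                          /* child */
        close(fds[0]);                       /* child never reads the pipe */
        if (dup2(fds[1], STDOUT_FILENO) == -1)
            _exit(127);
        close(fds[1]);                       /* stdout now refers to the pipe */
        execvp(argv[0], argv);
        _exit(127);                          /* reached only if exec failed */
    }

    /* Parent: close the write end, otherwise read() never sees EOF. */
    close(fds[1]);
    size_t total = 0;
    ssize_t n;
    while (total < cap - 1 &&
           (n = read(fds[0], buf + total, cap - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A cmd1 | cmd2 pipeline is the same pattern applied twice: the first child dup2s the pipe's write end onto stdout, the second dup2s the read end onto stdin, and the shell closes both ends before waiting.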
Goal: Learn how C handles numbers at the hardware level - essential groundwork for the neural network.
- IEEE 754 floating point - representation, precision, NaN, inf
- float vs double - when each matters
- SIMD intuition (won't write intrinsics yet, but understand what the compiler can do)
- Cache locality - row-major vs column-major access patterns, why it matters for matrix ops
- Profiling: gprof, perf, or just clock() - profile before optimising, always
- Compiler optimisation flags: -O0 vs -O2 vs -O3, and what the compiler is actually doing
- static, inline, and const as optimisation hints
- restrict keyword, const correctness
- Fixed-size integer types: int32_t, uint8_t etc. (<stdint.h>)
- Basic linear algebra in C: matrix multiply, dot product, transpose - written by hand
- What Every Computer Scientist Should Know About Floating-Point Arithmetic (Goldberg) - skim, don't memorise
- CS:APP chapter 2 (data representation)
Implement a small matrix library: creation, addition, elementwise multiply, matrix multiply, transpose, scalar ops. Backed by flat float arrays. Include a basic benchmark comparing naive vs cache-friendly implementations.
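To make "naive vs cache-friendly" concrete, here is one common pair of loop orders for row-major flat float arrays. The function names are illustrative, and whether the second version is actually faster on a given machine is something the benchmark should establish rather than assume.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* C = A * B with A (m x k), B (k x n), C (m x n), all row-major.
   Naive i-j-p order: the inner loop walks B down a column, so consecutive
   reads of B are n floats apart in memory. */
static void matmul_ijp(const float *a, const float *b, float *c,
                       size_t m, size_t k, size_t n)
{
    assert(c != a && c != b);   /* output must not alias the inputs */
    for (size_t i = 0; i < m; i++)
        for (size_t j = 0; j < n; j++) {
            float sum = 0.0f;
            for (size_t p = 0; p < k; p++)
                sum += a[i * k + p] * b[p * n + j];
            c[i * n + j] = sum;
        }
}

/* Same product in i-p-j order: the inner loop walks a row of B and a row
   of C, so every access in the hot loop is to adjacent memory. */
static void matmul_ipj(const float *a, const float *b, float *c,
                       size_t m, size_t k, size_t n)
{
    assert(c != a && c != b);
    memset(c, 0, m * n * sizeof *c);   /* all-zero bytes are 0.0f in IEEE 754 */
    for (size_t i = 0; i < m; i++)
        for (size_t p = 0; p < k; p++) {
            float aip = a[i * k + p];
            for (size_t j = 0; j < n; j++)
                c[i * n + j] += aip * b[p * n + j];
        }
}
```

Both functions add the same products for each output element in the same p order, so on small exact inputs they give identical results; that makes the naive version a convenient correctness oracle for the faster one.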
Why: The direct precursor to the neural network. By the end I'll have the exact primitives needed, and understand why they're written the way they are.
Stretch: Add basic BLAS-style naming conventions; implement a simple softmax and sigmoid over a matrix.
Goal: Build a fully trainable neural network in C, using nothing but matlib and the standard library.
- Computational graph mental model (don't need to implement one - but understand it)
- Forward pass: layer types (dense/linear), activation functions (ReLU, sigmoid, softmax)
- Loss functions: MSE, cross-entropy
- Backpropagation: chain rule, gradient accumulation
- Parameter update rules: SGD, then optionally Adam
- Mini-batch training loop
- Saving/loading weights to binary files
- Numerical gradient checking (finite differences) - for verifying backprop
- Andrej Karpathy's micrograd (Python) - read the source before building in C, it's the clearest backprop reference available
- Neural Networks and Deep Learning (Nielsen, free online) - for the maths
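The "numerical gradient checking (finite differences)" item from the concept list can be sketched as follows. The loss_fn signature, the Sample struct and the single linear neuron used as the test subject are all assumptions chosen to keep the example small; the same central-difference routine would be pointed at the full network's loss.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical loss signature: parameters in w, fixed data behind ctx. */
typedef double (*loss_fn)(const double *w, size_t n, const void *ctx);

/* Central difference: dL/dw_i ~ (L(w + eps*e_i) - L(w - eps*e_i)) / (2*eps).
   w is perturbed in place and restored before returning. */
static void numerical_gradient(loss_fn loss, double *w, size_t n,
                               const void *ctx, double eps, double *grad_out)
{
    assert(eps > 0.0);
    for (size_t i = 0; i < n; i++) {
        double saved = w[i];
        w[i] = saved + eps;
        double plus = loss(w, n, ctx);
        w[i] = saved - eps;
        double minus = loss(w, n, ctx);
        w[i] = saved;
        grad_out[i] = (plus - minus) / (2.0 * eps);
    }
}

/* Test subject: one linear neuron with squared error on a single sample. */
typedef struct { const double *x; double y; } Sample;

static double neuron_loss(const double *w, size_t n, const void *ctx)
{
    const Sample *s = ctx;
    double pred = 0.0;
    for (size_t i = 0; i < n; i++)
        pred += w[i] * s->x[i];
    double err = pred - s->y;
    return 0.5 * err * err;
}

/* Hand-derived gradient via the chain rule: dL/dw_i = (pred - y) * x_i. */
static void neuron_grad(const double *w, size_t n, const Sample *s,
                        double *grad_out)
{
    double pred = 0.0;
    for (size_t i = 0; i < n; i++)
        pred += w[i] * s->x[i];
    for (size_t i = 0; i < n; i++)
        grad_out[i] = (pred - s->y) * s->x[i];
}

/* Largest absolute component-wise difference between two gradients. */
static double max_abs_diff(const double *a, const double *b, size_t n)
{
    double worst = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = a[i] > b[i] ? a[i] - b[i] : b[i] - a[i];
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

The check is done in double even if the network itself uses float, since the subtraction of two nearly equal losses loses most of a float's precision.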
Build a library supporting arbitrarily deep dense networks. Train it on MNIST (handwritten digits - simple binary format, easy to load in C). Target: >97% test accuracy.
Why: MNIST is the right target - hard enough to need real backprop, simple enough that debugging is tractable. The binary file format means writing a data loader from scratch, reinforcing file I/O skills.
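As a starting point for the data loader, the sketch below parses the header of an MNIST image file. It assumes the standard IDX layout used by the MNIST distribution: four big-endian 32-bit integers (magic number 0x00000803, image count, rows, columns) followed by one unsigned byte per pixel. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical holder for the three dimensions in an IDX image header. */
typedef struct {
    uint32_t count;   /* number of images */
    uint32_t rows;
    uint32_t cols;
} IdxImageHeader;

/* Assemble a big-endian 32-bit value byte by byte, so the result is the
   same regardless of the host's endianness. */
static uint32_t read_be32(const unsigned char b[4])
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* Parse the 16-byte header from memory. Returns 0 on success, -1 if the
   buffer is too short or the magic number is not that of an image file. */
static int idx_parse_image_header(const unsigned char *buf, size_t len,
                                  IdxImageHeader *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 16 || read_be32(buf) != 0x00000803u)
        return -1;
    out->count = read_be32(buf + 4);
    out->rows  = read_be32(buf + 8);
    out->cols  = read_be32(buf + 12);
    return 0;
}

/* Convenience wrapper: read the header from an open binary stream. */
static int idx_read_image_header(FILE *fp, IdxImageHeader *out)
{
    unsigned char buf[16];
    if (fread(buf, 1, sizeof buf, fp) != sizeof buf)
        return -1;
    return idx_parse_image_header(buf, sizeof buf, out);
}
```

Reading the header as raw bytes and shifting them together avoids the classic bug of fread-ing straight into a uint32_t on a little-endian machine and getting a byte-swapped count.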
Stretch: Add a conv layer. Save weights and write a small inference-only CLI: ./cnet predict image.bin.
Goal: Build a language and solver engine for linear programming mathematics in C.
- Lexing and parsing a domain-specific language for expressing linear programs
- The simplex method: tableau representation, pivoting, basis selection, standard form
- Slack variables, dual variables, shadow prices, reduced costs
- Detecting infeasible and unbounded problems
- Numerical stability considerations in iterative matrix algorithms
- Introduction to Linear Programming (Bertsimas & Tsitsiklis): chapters 1–3 for the theory
- Understand the simplex algorithm by hand before implementing it
- matlib from Phase 4 as the computational backend
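The tableau pivoting step from the concept list, together with the minimum-ratio test that picks the leaving row, might look like the sketch below. It assumes a dense row-major tableau of doubles with the right-hand side in the last column; the function names and layout are illustrative, and it does no tolerance handling for near-zero pivots, which the numerical stability item would address.

```c
#include <assert.h>
#include <stddef.h>

/* Minimum-ratio test over the first constraint_rows rows for entering
   column pc. Returns the leaving row, or -1 if no entry in the column is
   positive (the problem is unbounded in that direction). */
static ptrdiff_t ratio_test(const double *t, size_t constraint_rows,
                            size_t cols, size_t pc)
{
    ptrdiff_t best = -1;
    double best_ratio = 0.0;
    for (size_t i = 0; i < constraint_rows; i++) {
        double coef = t[i * cols + pc];
        if (coef <= 0.0)
            continue;
        double ratio = t[i * cols + (cols - 1)] / coef;   /* rhs / coef */
        if (best == -1 || ratio < best_ratio) {
            best = (ptrdiff_t)i;
            best_ratio = ratio;
        }
    }
    return best;
}

/* Pivot in place on (pr, pc): scale the pivot row so the pivot element
   becomes 1, then eliminate column pc from every other row, including
   the objective row. */
static void tableau_pivot(double *t, size_t rows, size_t cols,
                          size_t pr, size_t pc)
{
    double pivot = t[pr * cols + pc];
    assert(pivot != 0.0);
    for (size_t j = 0; j < cols; j++)
        t[pr * cols + j] /= pivot;
    for (size_t i = 0; i < rows; i++) {
        if (i == pr)
            continue;
        double factor = t[i * cols + pc];
        if (factor == 0.0)
            continue;
        for (size_t j = 0; j < cols; j++)
            t[i * cols + j] -= factor * t[pr * cols + j];
    }
}
```

For example, maximising 3x + 2y subject to x + y ≤ 4 and x + 3y ≤ 6 gives a 3 x 5 tableau (two constraint rows plus the objective row -3, -2, 0, 0 | 0). Entering on x, the ratio test selects the first row, and a single pivot leaves no negative entries in the objective row, with the objective value 12 in its last column.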
Build a tool that lets you express a linear program in a natural language format and solves it, reporting a full solution: primal variables, objective value, slacks, duals, and reduced costs.
Why: Combines language front-end work with numerical computing. Linear programming appears throughout operations research, economics, logistics, and machine learning. Open-ended enough to keep iterating on and potentially useful to others.
Stretch: Two-phase simplex for problems without an obvious initial feasible point. Minimisation and maximisation. Sensitivity analysis. A library interface so the solver can be embedded in other projects.
Placement: After Phase 5 (matlib primitives available), before Phase 6 (warms up lexer and parser thinking without the full complexity of a general language).
Goal: Build a working interpreted language in C.
- Lexing: token types, scanning, handling whitespace/comments
- Parsing: recursive descent, operator precedence, AST node types
- Tree-walking interpreter: environments, scopes, closures
- Memory management for AST nodes and runtime objects
- Error reporting with line numbers
- Optionally: bytecode compilation + a stack-based VM
- Crafting Interpreters (Nystrom, free online) - the definitive resource. Follow the C half (Part III), not the Java half.
- Do not skip the challenges at the end of each chapter.
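As a warm-up for the "recursive descent, operator precedence" item, here is a small arithmetic evaluator where each precedence level gets its own function. It is a deliberate simplification: it scans characters directly instead of using a separate lexer, and it computes values instead of building AST nodes. The Parser struct and function names are made up for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical parser state: a cursor into the source plus an error flag. */
typedef struct {
    const char *p;
    int error;
} Parser;

static double parse_expr(Parser *ps);

static void skip_ws(Parser *ps)
{
    while (isspace((unsigned char)*ps->p))
        ps->p++;
}

/* primary := NUMBER | '(' expr ')' | '-' primary */
static double parse_primary(Parser *ps)
{
    skip_ws(ps);
    if (*ps->p == '(') {
        ps->p++;
        double v = parse_expr(ps);
        skip_ws(ps);
        if (*ps->p == ')')
            ps->p++;
        else
            ps->error = 1;          /* missing closing parenthesis */
        return v;
    }
    if (*ps->p == '-') {
        ps->p++;
        return -parse_primary(ps);
    }
    char *end;
    double v = strtod(ps->p, &end);
    if (end == ps->p) {             /* no number at this position */
        ps->error = 1;
        return 0.0;
    }
    ps->p = end;
    return v;
}

/* term := primary (('*' | '/') primary)*   binds tighter than + and - */
static double parse_term(Parser *ps)
{
    double v = parse_primary(ps);
    for (;;) {
        skip_ws(ps);
        char op = *ps->p;
        if (op != '*' && op != '/')
            return v;
        ps->p++;
        double rhs = parse_primary(ps);
        v = (op == '*') ? v * rhs : v / rhs;
    }
}

/* expr := term (('+' | '-') term)*   the loop makes it left-associative */
static double parse_expr(Parser *ps)
{
    double v = parse_term(ps);
    for (;;) {
        skip_ws(ps);
        char op = *ps->p;
        if (op != '+' && op != '-')
            return v;
        ps->p++;
        double rhs = parse_term(ps);
        v = (op == '+') ? v + rhs : v - rhs;
    }
}

/* Evaluate a whole string. Returns 0 and stores the result in *out, or
   returns -1 on a syntax error or trailing input. */
static int eval(const char *src, double *out)
{
    assert(src != NULL && out != NULL);
    Parser ps = { src, 0 };
    double v = parse_expr(&ps);
    skip_ws(&ps);
    if (ps.error || *ps.p != '\0')
        return -1;
    *out = v;
    return 0;
}
```

Turning this into a real front end means replacing the double return values with pointers to allocated AST nodes and replacing the character cursor with a token stream that carries line numbers for error reporting.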
A complete interpreted language with: variables, control flow, functions, closures, classes, and a mark-sweep garbage collector.
Why: Crafting Interpreters guides directly to this. By Phase 6 the C fluency is there to go off-piste - modify the language, add features, make it your own.
Stretch: Compile to bytecode with an optimising pass; add a foreign function interface so C code can be called from the language.
Goal: Apply everything. Pick a simulation domain and build it properly.
- N-body gravitational simulation - particles, Euler/Verlet integration, visualised via SDL2
- Cellular automaton - Conway's Life is trivial; reaction-diffusion (Gray-Scott) is genuinely interesting
- Fluid simulation - Navier-Stokes on a grid (Eulerian), Jos Stam's stable fluids paper is the reference
- Neural-physics hybrid - train cnet to predict trajectories of a physical system
Before building the simulator, spend a short sprint on SDL2:
- Setup: linking SDL2, creating a window and renderer
- The event loop: handling quit, keyboard, mouse events
- Drawing primitives: pixels, rectangles, lines
- Updating the display each frame (SDL_RenderPresent)
- Mapping simulation state (e.g. particle positions, grid values) to pixel coordinates
SDL2 is well-documented and the basics are achievable in a day or two - it doesn't warrant a full phase.
- Numerical integration methods (Euler, RK4)
- Spatial data structures (grid, quadtree) for performance
- Parallelism: pthreads or OpenMP if performance demands it
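The choice of integrator matters more than it first appears. The sketch below compares explicit Euler with velocity Verlet on a deliberately simple test system, a unit-mass, unit-stiffness harmonic oscillator (an assumption made so that the exact energy is known). All names are illustrative.

```c
#include <assert.h>

/* Position and velocity of a single 1-D particle. */
typedef struct {
    double x, v;
} State;

/* Assumed test system: a = -x (unit mass on a unit-stiffness spring). */
static double accel(double x)
{
    return -x;
}

/* Explicit Euler: both updates use values from the start of the step. */
static State step_euler(State s, double dt)
{
    State n;
    n.x = s.x + s.v * dt;
    n.v = s.v + accel(s.x) * dt;
    return n;
}

/* Velocity Verlet: half-kick, full drift, half-kick with the new force. */
static State step_verlet(State s, double dt)
{
    double v_half = s.v + 0.5 * accel(s.x) * dt;
    State n;
    n.x = s.x + v_half * dt;
    n.v = v_half + 0.5 * accel(n.x) * dt;
    return n;
}

/* Total energy of the oscillator: kinetic plus potential. */
static double energy(State s)
{
    return 0.5 * (s.v * s.v + s.x * s.x);
}

typedef State (*step_fn)(State, double);

/* Advance the state a number of steps and report the final energy. */
static double energy_after(step_fn step, State s, double dt, int steps)
{
    assert(dt > 0.0 && steps >= 0);
    for (int i = 0; i < steps; i++)
        s = step(s, dt);
    return energy(s);
}
```

For this system each explicit Euler step multiplies the energy by exactly (1 + dt²), so the orbit spirals outward no matter how small the step, while velocity Verlet keeps the energy oscillating within a narrow band around its initial value. The same drift shows up in an N-body simulation as planets slowly escaping.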
- Always compile with -Wall -Wextra -pedantic. Fix every warning.
- During development, always build with -fsanitize=address,undefined. Strip for release builds.
- valgrind --leak-check=full on every project before calling it done.
- Write a Makefile for every project from Phase 1 onwards.
- Read other people's C - the Redis source (src/) and SQLite amalgamation are both excellent references.