Building A Dependency-Free GPT On A Custom OS

The construction of a large language model (LLM) depends on many things: banks of GPUs, vast reams of training data, massive amounts of power, and matrix manipulation libraries like Numpy. For models with lower requirements though, it’s possible to do away with all of that, including the software dependencies. As someone who’d already built a full operating system as a C learning project, [Ethan Zhang] was no stranger to intimidating projects, and as an exercise in minimalism, he decided to build a generative pre-trained transformer (GPT) model in the kernel space of his operating system.

As with a number of other small demonstration LLMs, this was inspired by [Andrej Karpathy]’s MicroGPT, specifically by its lack of external dependencies. The first step was to strip away every unnecessary element from MooseOS, the operating system [Ethan] had previously written, including the GUI, most drivers, and the filesystem. All that’s left is the kernel, and KernelGPT runs on this. To get around the lack of a filesystem, the training data was converted into a header to keep it in memory — at only 32,000 words, this was no problem. Like the original MicroGPT, this is trained on a list of names, and predicts new names. Due to some hardware issues, [Ethan] hasn’t yet been able to test this on a physical computer, but it does work in QEMU.

It’s quite impressive to see such a complex piece of software written solely in C, running directly on hardware; for a project which takes the same starting point and goes in the opposite direction, check out this browser-based implementation of MicroGPT. For more on the math behind GPTs, check out this visualization.

Continue reading “Building A Dependency-Free GPT On A Custom OS”

C Project Turns Into Full-Fledged OS

While some of us may have learned C in order to interact with embedded electronics or deep with computing hardware of some sort, others learn C for the challenge alone. Compared to newer languages like Python there’s a lot that C leaves up to the programmer that can be incredibly daunting. At the beginning of the year [Ethan] set out with a goal of learning C for its own sake and ended up with a working operating system from scratch programmed in not only C but Assembly as well.

[Ethan] calls his project Moderate Overdose of System Eccentricity, or MooseOS. Original programming and testing was done in QEMU on a Mac where he was able to build all of the core components of the operating system one-by-one including a kernel, a basic filesystem, and drivers for PS/2 peripherals as well as 320×200 VGA video. It also includes a dock-based GUI with design cues from operating systems like Macintosh System 1. From that GUI users can launch a few applications, from a text editor, a file explorer, or a terminal. There’s plenty of additional information about this OS on his GitHub page as well as a separate blog post.

The project didn’t stay confined to the QEMU virtual machine either. A friend of his was throwing away a 2009-era desktop which [Ethan] quickly grabbed to test his operating system on bare metal. There was just one fault that the real hardware threw that QEMU never did, but with a bit of troubleshooting it was able to run. He also notes that this was inspired by a wiki called OSDev which, although a bit dated now, is a great place to go to learn about the fundamentals of operating systems. We’d also recommend checking out this project that performs a similar task but on the RISC-V instruction set instead.