tobert/halfremembered-models
🎵 halfremembered-models

🤖 AI Generated Codebase

This project was entirely designed and implemented by Claude 4.5 Opus, Claude 4.5 Sonnet, and Gemini 3.0 Pro Preview, acting as autonomous software engineering agents under human supervision.

ML model services for the halfremembered agentic music production system.

Each service = one process, one model, one bespoke API.

Note: This repo is hand-crafted for a specific AMD ROCm setup on Arch Linux. If you're running different hardware, you'll need to adapt the PyTorch/ROCm configuration.

Architecture

┌─────────────────────────────────────────────────────────┐
│  Your Application / MCP Server / CLI                    │
│  - Orchestrates model calls                             │
│  - Combines outputs (MIDI → audio, etc.)                │
└──────────────────────┬──────────────────────────────────┘
                       │ HTTP (localhost:200x)
                       ▼
┌─────────────────────────────────────────────────────────┐
│  Model Services (Python/FastAPI)       ← this repo      │
│  - One process per model                                │
│  - Independent venvs (uv)                               │
│  - Bespoke APIs per model                               │
│  - Ports 2000-2099                                      │
│  - Managed via systemd user units                       │
└─────────────────────────────────────────────────────────┘

Each service is standalone - start what you need, ignore the rest.

Quick Start

# Install just (task runner)
# Arch: pacman -S just
# Mac: brew install just

# Set up a service
just sync clap

# Run a service
just run clap

# Check all services
just status-all

# Run tests
just test clap

Services

| Port | Service            | Model                       | Description                      |
|------|--------------------|-----------------------------|----------------------------------|
| 2000 | orpheus-base       | asigalov61/Orpheus          | MIDI generation (480M, fp16)     |
| 2001 | orpheus-classifier | asigalov61/Orpheus          | Human vs AI classification       |
| 2002 | orpheus-bridge     | asigalov61/Orpheus          | Cross-section bridging           |
| 2003 | orpheus-loops      | asigalov61/Orpheus          | Loop generation                  |
| 2004 | orpheus-children   | asigalov61/Orpheus          | Children's music                 |
| 2005 | orpheus-mono       | asigalov61/Orpheus          | Monophonic melodies              |
| 2006 | musicgen           | facebook/musicgen-medium    | Text-to-music (1.5B)             |
| 2007 | clap               | laion/larger_clap_music     | Audio-text embeddings (512d)     |
| 2008 | yue                | m-a-p/YuE-s1-7B + s2-1B     | Lyrics-to-song with vocals       |
| 2010 | audioldm2          | cvssp/audioldm2             | Text-to-audio diffusion          |
| 2011 | anticipatory       | stanford-crfm/music-medium  | Anticipatory music (800M)        |
| 2012 | beat-this          | CPJKU/beat-this             | Beat/downbeat detection          |
| 2013 | demucs             | facebook/demucs             | Audio source separation (stems)  |
| 2020 | (external)         | llama.cpp                   | OpenAI-compatible LLM API        |
| 2099 | observer           | (uses llama.cpp)            | GPU/system observability         |
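For client code, the port assignments above can be mirrored as a small name-to-port registry. This dict is built from the table for illustration; the repo does not necessarily ship one:

```python
# Service name -> port, taken from the services table.
SERVICES = {
    "orpheus-base": 2000,
    "orpheus-classifier": 2001,
    "orpheus-bridge": 2002,
    "orpheus-loops": 2003,
    "orpheus-children": 2004,
    "orpheus-mono": 2005,
    "musicgen": 2006,
    "clap": 2007,
    "yue": 2008,
    "audioldm2": 2010,
    "anticipatory": 2011,
    "beat-this": 2012,
    "demucs": 2013,
}

def base_url(name: str) -> str:
    """Return the local base URL for a named service."""
    return f"http://localhost:{SERVICES[name]}"
```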

Hardware

This repo is developed on:

  • CPU: AMD Ryzen AI MAX+ 395 (32 CUs)
  • GPU: AMD Radeon 8060S (gfx1151, RDNA 3.5, 40 CUs)
  • VRAM: 96GB unified memory (shared CPU/GPU)
  • Memory bandwidth: ~240 GB/s (the bottleneck for LLM inference)
  • OS: Arch Linux (rolling release)
  • Python: 3.13 (via python313 package)
  • ROCm: 7.10.0 via official AMD wheels
  • PyTorch: 2.9.1+rocm7.10.0 from repo.amd.com/rocm/whl/gfx1151/

Key constraint: Memory bandwidth limits 7B model inference to ~13-17 tok/s regardless of GPU utilization.
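The ceiling follows from simple arithmetic: a memory-bound decoder must stream every weight once per generated token, so throughput is roughly bandwidth divided by model size in bytes. A back-of-envelope check (assuming fp16 weights at 2 bytes/parameter):

```python
def max_tokens_per_sec(bandwidth_gb_s: float, params_b: float,
                       bytes_per_param: float = 2.0) -> float:
    """Upper bound on decode speed for a memory-bound model:
    each token requires reading all weights once."""
    bytes_per_token = params_b * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token

# ~240 GB/s over a 7B fp16 model (14 GB of weights)
print(round(max_tokens_per_sec(240, 7), 1))  # 17.1
```

That 17 tok/s figure is the theoretical ceiling; kernel overhead and KV-cache traffic pull real throughput toward the lower end of the 13-17 range quoted above.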

ROCm/uv Setup Note

The AMD gfx1151 wheel index distributes the rocm metapackage as a source tarball. uv's resolver struggles with its transitive dependencies. We work around this with override-dependencies in each service's pyproject.toml:

[tool.uv]
override-dependencies = [
    "rocm==7.10.0",
    "rocm-sdk-core==7.10.0",
    "rocm-sdk-libraries-gfx1151==7.10.0",
]

See CLAUDE.md for full pyproject.toml configuration.

Project Structure

halfremembered-models/
├── hrserve/                 # Shared serving library
│   ├── pyproject.toml
│   ├── hrserve/
│   │   ├── model_base.py    # ModelAPI base class
│   │   ├── audio_utils.py   # Audio encoding
│   │   ├── midi_utils.py    # MIDI encoding
│   │   └── ...
│   └── tests/
│
├── services/
│   ├── clap/                # Each service is self-contained
│   │   ├── pyproject.toml   # Own deps, hrserve as editable
│   │   ├── api.py           # LitAPI implementation
│   │   ├── server.py        # Bootstrap
│   │   └── tests/
│   ├── orpheus-base/
│   ├── musicgen/
│   └── ...
│
├── systemd/                 # Service units
├── justfile                 # Task runner
└── CLAUDE.md               # Agent instructions

Service API

Each service exposes:

  • POST /predict - Model inference (JSON in, JSON out)
  • GET /health - Returns {"status": "ok"} when ready

Services are independent - call them directly via HTTP.
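Calling a service needs nothing beyond the standard library. A hedged sketch (the payload fields in the example call are hypothetical, since each service's bespoke API defines its own):

```python
import json
from urllib.request import Request, urlopen

def predict(port: int, payload: dict, timeout: float = 30.0) -> dict:
    """POST a JSON payload to a service's /predict endpoint."""
    req = Request(
        f"http://localhost:{port}/predict",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(req, timeout=timeout) as resp:
        return json.load(resp)

def healthy(port: int, timeout: float = 2.0) -> bool:
    """True if GET /health answers with {"status": "ok"}."""
    try:
        with urlopen(f"http://localhost:{port}/health", timeout=timeout) as resp:
            return json.load(resp).get("status") == "ok"
    except OSError:
        return False

# Example (requires the clap service running; fields are illustrative):
# predict(2007, {"texts": ["warm analog synth pad"]})
```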

License & Attribution

Code

The source code in this repository is released under the MIT License. See LICENSE for details.

Model Weights

This repository contains code to run various ML models. The model weights themselves are subject to their own licenses (e.g., CC-BY-NC, Apache 2.0, Meta Research License).

  • Orpheus Models: Custom trained, CC-BY-NC 4.0.
  • MusicGen: Meta/Facebook Research (CC-BY-NC 4.0 / MIT).
  • CLAP: LAION (Apache 2.0 / MIT).
  • Demucs: Meta/Facebook Research (MIT).
  • YuE: Open source (Apache 2.0).
  • Qwen2.5-VL: Alibaba Cloud (Apache 2.0).

Please consult the individual service directories or original model repositories for specific weight licensing.

AI Attribution

This codebase was generated by Claude 4.5 Opus, Claude 4.5 Sonnet, and Gemini 3.0 Pro Preview, acting as autonomous software engineering agents under human supervision.
