Releases: Hekbas/Luth

v2.3.0 — core-reorg

18 Apr 17:43

Reorganized luth/source/luth/core/ from a flat 20-file folder into three semantic sub-folders. LuthTypes.h split by purpose; Math.h renamed and merged with the GLM aliases. Pure file-organization refactor — zero runtime change.

Layout

luth/source/luth/core/
├── App, EntryPoint, Version, FrameData, UUID,         # top-level lifecycle
│   EditorHooks, ProjectFile                           # (stays at root)
├── types/
│   ├── LuthTypes.h        primitives only (i8…u64, f32/f64, byte, fs)
│   ├── LuthMath.h         Vec/Mat/Quat aliases + Math:: facade + Assimp/AABB/Frustum
│   └── TypeTraits.h       IsGLMVector / IsGLMMatrix
├── diagnostics/
│   ├── Log.{h,cpp}
│   ├── LogFormatters.h    fmt + ostream<< Vec3/Mat4 (moved from LuthTypes.h)
│   └── Profiler.h
└── time/
    ├── Time.h
    └── Timer.h

Migration

  • 132 caller files rewritten via perl -pi.
  • LuthTypes.h consumers split by Vec/Mat/Quat usage: 18 → LuthMath.h, 48 → LuthTypes.h (primitives-only).
  • ostream<< Vec3/Mat4 operators relocated from LuthTypes.h to LogFormatters.h (logging concern).
  • Dead Luth::Normalize / Cross forward decls deleted (no definitions, no callers — superseded by Math::* in v2.2.0).

Both Debug + Release x64 build clean.

Issue: #81
Commits: 3 on epic/core-reorg

v2.2.0 — math-abstraction

18 Apr 17:21

Date: 2026-04-18
Commits: 8 (on epic/math-abstraction)
Issue: #80


Overview

Second epic of the post-v2.0 architecture-review series. Built Luth::Math as a single-source facade for every glm type and function the engine uses. After this epic, only LuthTypes.h and Math.h include <glm/...>; engine and editor code use Vec*/Mat*/Quat aliases and Math::* wrappers exclusively.

Minor version bump to v2.2.0 per the ROADMAP MINOR rule (one completed epic with engineering-visible changes — single math include, modern C++20 templated constants, type-aware tolerances, latent numeric_limits::min vs lowest confusion fixed in AABB).


Sub-Tasks

| # | Sub-task | Commit |
|---|---|---|
|   | Spec scaffold | 8ed0300 docs(epic): add math-abstraction spec |
| A | Reshape LuthTypes.h primitive layer | eaeaa00 refactor(core): reshape LuthTypes.h primitive layer |
| B | Build Luth::Math facade | 7d91417 feat(core): build Luth::Math facade with constants and function wrappers |
| C | Pass A engine, types | 9c6f2b9 refactor(luth): migrate engine to Vec*/Mat* aliases (Pass A) |
| D | Pass A editor, types | 430beca refactor(luthien): migrate editor to Vec*/Mat* aliases (Pass A) |
| E | Pass B engine, functions | fc55bd8 refactor(luth): route engine math through Luth::Math facade (Pass B) |
| F | Pass B editor, functions | c9188f3 refactor(luthien): route editor math through Luth::Math facade (Pass B) |
| G | Purge direct glm includes | 1bef56b chore(core): purge direct glm includes engine-wide |
| H | Docs + v2.2.0 + history + wrap-up | chore(release): math-abstraction → v2.2.0 |

Type Aliases (LuthTypes.h)

Before

using Vec2 = glm::vec2;  using Vec3 = glm::vec3;  using Vec4 = glm::vec4;
using Mat3 = glm::mat3;  using Mat4 = glm::mat4;
using Quat = glm::quat;

constexpr f32 PI       = 3.14159265358979323846f;
constexpr f32 TWO_PI   = 2.0f * PI;
constexpr f32 HALF_PI  = 0.5f * PI;
constexpr f32 EPSILON  = std::numeric_limits<f32>::epsilon();
constexpr f32 FLOAT_MAX = std::numeric_limits<f32>::max();
constexpr f32 FLOAT_MIN = std::numeric_limits<f32>::min();   // smallest *positive*, almost never what callers want

After

using Vec2 = glm::vec2;  using Vec3 = glm::vec3;  using Vec4 = glm::vec4;
using IVec2 = glm::ivec2; using IVec3 = glm::ivec3; using IVec4 = glm::ivec4;
using UVec2 = glm::uvec2; using UVec3 = glm::uvec3; using UVec4 = glm::uvec4;
using Mat2 = glm::mat2;  using Mat3 = glm::mat3;  using Mat4 = glm::mat4;
using Quat = glm::quat;
// Constants block deleted — moved to Luth::Math (templated)

The 6 root-namespace constants had zero call sites when verified — dead code. Reborn under Math:: as templated inline constexpr variables. The EPSILON = numeric_limits::epsilon() was a particularly misleading name (machine epsilon ≈ 1.19e-7 vs. spatial tolerance ~1e-4). Replaced by named tolerances (SmallNumber, KindaSmallNumber) and explicit MachineEpsilon for the rare ULP-level case.


Math Facade (Math.h)

Constants

namespace Luth::Math {
    // Standard — delegate to <numbers>
    template<class T> inline constexpr T Pi    = std::numbers::pi_v<T>;
    template<class T> inline constexpr T E     = std::numbers::e_v<T>;
    template<class T> inline constexpr T Sqrt2 = std::numbers::sqrt2_v<T>;

    // Derived
    template<class T> inline constexpr T TwoPi     = T(2) * Pi<T>;
    template<class T> inline constexpr T HalfPi    = Pi<T> / T(2);
    template<class T> inline constexpr T QuarterPi = Pi<T> / T(4);
    template<class T> inline constexpr T InvPi     = T(1) / Pi<T>;
    template<class T> inline constexpr T DegToRad  = Pi<T> / T(180);
    template<class T> inline constexpr T RadToDeg  = T(180) / Pi<T>;

    // Tolerances (engine value-add)
    template<class T> inline constexpr T SmallNumber      = T(1e-8);
    template<class T> inline constexpr T KindaSmallNumber = T(1e-4);

    // Sentinels — delegate to <limits>
    template<class T> inline constexpr T FloatMax       = std::numeric_limits<T>::max();
    template<class T> inline constexpr T FloatLowest    = std::numeric_limits<T>::lowest();
    template<class T> inline constexpr T FloatMin       = std::numeric_limits<T>::min();
    template<class T> inline constexpr T MachineEpsilon = std::numeric_limits<T>::epsilon();
}

Function wrappers

25 wrappers covering every glm::* function the engine uses:

| Group | Functions |
|---|---|
| Transformation | Translate, Rotate, Scale, Perspective, Ortho (×2), LookAt |
| Linear algebra | Inverse, Transpose, Normalize, Length, Length2, Dot, Cross |
| Interpolation | Mix, Slerp |
| Quaternion | ToMat4 (covers both glm::toMat4 and glm::mat4_cast), EulerAngles, QuatLookAt, Decompose |
| Angle conversion | Radians, Degrees |
| Common math | Clamp, Min, Max, Abs (function templates; work on scalar + vector) |
| Pointer helpers | ValuePtr (const + non-const), MakeVec3 |
| Type re-exports | length_t, qualifier |

AABB sentinel fix

// Before
struct AABB {
    Vec3 Min = Vec3( std::numeric_limits<float>::max());
    Vec3 Max = Vec3(-std::numeric_limits<float>::max());   // works but not by name
    ...
};

// After
struct AABB {
    Vec3 Min = Vec3(Math::FloatMax<f32>);
    Vec3 Max = Vec3(Math::FloatLowest<f32>);   // explicit lowest()
    void Expand(const Vec3& p) { Min = Math::Min(Min, p); Max = Math::Max(Max, p); }
    ...
};

Both old and new code work, but the new spelling makes the intent (lowest(), not min()) impossible to misread.
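
Why lowest() and not min(): a minimal standalone sketch, using a hypothetical three-float Vec3 in place of the glm-backed alias and std::min/std::max in place of Math::Min/Math::Max. For floating-point types, numeric_limits::min() is the smallest positive normal value, so seeding Max with it would leave a box built from all-negative points with a bogus upper bound.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>

// Stand-in for the engine's Vec3 alias (hypothetical; the real one wraps glm::vec3).
struct Vec3 { float x, y, z; };

struct AABB {
    // Start "inside out": any real point shrinks Min and grows Max.
    Vec3 Min{ std::numeric_limits<float>::max(),
              std::numeric_limits<float>::max(),
              std::numeric_limits<float>::max() };
    Vec3 Max{ std::numeric_limits<float>::lowest(),
              std::numeric_limits<float>::lowest(),
              std::numeric_limits<float>::lowest() };

    void Expand(const Vec3& p) {
        Min = { std::min(Min.x, p.x), std::min(Min.y, p.y), std::min(Min.z, p.z) };
        Max = { std::max(Max.x, p.x), std::max(Max.y, p.y), std::max(Max.z, p.z) };
    }
};

// min() is the smallest positive normal float, not the most negative value.
static_assert(std::numeric_limits<float>::min() > 0.0f);
static_assert(std::numeric_limits<float>::lowest() == -std::numeric_limits<float>::max());
```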


Migration

Pass A — types (commits C, D)

glm::vec2/3/4, glm::ivec2/3/4, glm::uvec2/3/4, glm::mat2/3/4, glm::quat → Vec*/IVec*/UVec*/Mat*/Quat aliases. Word-boundaried perl in-place across 35 files (28 engine + 7 editor); LuthTypes.h excluded so the IsGLMVector<glm::vec<L,T,Q>> and IsGLMMatrix<glm::mat<C,R,T,Q>> trait specializations stay valid (their job is to detect glm types).

Pass B — functions (commits E, F)

glm::translate/rotate/scale/... → Math::Translate/Rotate/Scale/.... Same files, same tool. Special cases:

  • glm::two_pi<f32>() → Math::TwoPi<f32> (variable template, no parens)
  • glm::mat4_cast(quat) and glm::toMat4(quat) → unified as Math::ToMat4
  • glm::length_t, glm::qualifier → Math::length_t, Math::qualifier (template-parameter usages in EditorCamera)
  • glm::length2 (unscoped from initial inventory) → Math::Length2 — surfaced during commit E, facade extended in same commit
  • glm::decompose (full TRS+skew+perspective unpack used by FrameDebuggerPanel) → Math::Decompose — facade extended in commit E for the editor consumer

Purge — includes (commit G)

#include <glm/...> removed from all 33 non-facade files. Math.h and LuthTypes.h are now the only files including glm headers. PCH carries the facade into every translation unit transparently — no consumer needs an explicit Math.h include.

LuthTypes.h operator<< formatter signatures cleaned up: const glm::vec3& → const Vec3& (typedef-equivalent, just clearer style).


Final Tally

| Metric | Before | After | Delta |
|---|---|---|---|
| Files with glm:: references | 37 | 2 (LuthTypes.h, Math.h: the facade) | −35 |
| Total glm:: references | ~450 | 50 (all inside the facade) | −400 |
| Files with <glm/...> includes | 35 | 2 (the facade) | −33 |
| Total <glm/...> include lines | ~38 | 10 (facade only) | −28 |

Key Design Decisions

Single math file vs split

Aliases stay in LuthTypes.h; functions and constants live in Math.h. PCH (luthpch.h) already pulls both, so callers get the full math layer transparently. The two-file split is conceptual (primitives vs operations) and matches the eventual E4 reorg where both move to core/types/. Doing it this way keeps E3 a clean enforcement pass — no folder moves, no file renames; E4 will move both files in one shot.

Templated inline constexpr over function form

Modern C++20 idiom (std::numbers::pi_v<T> is a variable template). Same expressiveness as the function form (glm::pi<T>()) without the trailing-parens noise. Math::Pi<f32> reads cleanly at the call site and, being constexpr, is usable in any constant expression.
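
A minimal sketch of the two forms side by side. The MathSketch namespace and PiFn are hypothetical, and the value is spelled as a literal so the snippet builds as C++17; the facade itself delegates to std::numbers::pi_v<T>.

```cpp
#include <cassert>

namespace MathSketch {
    // Literal stands in for std::numbers::pi_v<T> so the sketch builds as C++17.
    template<class T> inline constexpr T Pi       = T(3.14159265358979323846L);
    template<class T> inline constexpr T TwoPi    = T(2) * Pi<T>;
    template<class T> inline constexpr T DegToRad = Pi<T> / T(180);

    // Function form, in the style of glm::pi<T>(), for comparison.
    template<class T> constexpr T PiFn() { return T(3.14159265358979323846L); }

    // Derived constants compose without any call syntax.
    template<class T> constexpr T Radians(T degrees) { return degrees * DegToRad<T>; }
}

// Both forms are constant expressions; the variable template just drops the parens.
static_assert(MathSketch::Pi<double> == MathSketch::PiFn<double>());
static_assert(MathSketch::TwoPi<float> > 6.28f && MathSketch::TwoPi<float> < 6.29f);
```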

Engine value-add: named tolerances

Math::SmallNumber<T> (1e-8) and Math::KindaSmallNumber<T> (1e-4) borrow Unreal's distinction. Spatial code rarely wants numeric_limits::epsilon() (~1.19e-7); it wants a robust tolerance for "essentially zero" or "vectors are equal enough." Naming the two cases makes intent explicit and lets future code change the values without a hunt-and-replace.
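
A sketch of how a named tolerance might be consumed. NearlyEqual is a hypothetical helper, not a facade function listed in this release; the constants mirror the values above.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

namespace MathSketch {
    template<class T> inline constexpr T SmallNumber      = T(1e-8);
    template<class T> inline constexpr T KindaSmallNumber = T(1e-4);
    template<class T> inline constexpr T MachineEpsilon   = std::numeric_limits<T>::epsilon();

    // Hypothetical helper: absolute-tolerance comparison with a named default.
    template<class T>
    bool NearlyEqual(T a, T b, T tolerance = KindaSmallNumber<T>) {
        return std::abs(a - b) <= tolerance;
    }
}
```

Two positions that differ by about 1e-5 compare equal under KindaSmallNumber but not under MachineEpsilon, which is the distinction the naming is meant to surface.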

Header-only facade

All wrappers are inline one-line forwards to glm::*. No Math.cpp for E3. Optimized builds are expected to inline them away, so the forwards should add no call overhead. E4 may revisit if folder relocation warrants .cpp files for non-inline helpers (ComposeTransform, Frustum).

Bulk-rewrite split per source tree

Pass A and Pass B each split into separate luth/luthien commits (rather than one combined Pass A or Pass B). Lets git bisect isolate which side of the engine→editor boundary introduced any regression. Adds one extra build per pass — cheap insurance.

Trait specializations stay glm-typed

IsGLMVector<glm::vec<L,T,Q>> and IsGLMMatrix<glm::mat<C,R,T,Q>> in LuthTypes.h keep glm::vec/glm::mat because detecting glm types is their purpose. Renaming the specialization body would break the abstraction at its foundation. Pass A explicitly excluded LuthTypes.h.
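
Why the specializations have to name the library templates: a standalone sketch with a hypothetical fakeglm namespace standing in for glm (the real templates also carry a qualifier parameter Q). The partial specialization pattern-matches on the class template itself, so an alias such as Vec3 is detected through it.

```cpp
#include <cassert>
#include <type_traits>

// Stand-in for glm's class templates (hypothetical, just enough to specialize on).
namespace fakeglm {
    template<int L, class T> struct vec {};
    template<int C, int R, class T> struct mat {};
}

// Aliases in the style of LuthTypes.h.
using Vec3 = fakeglm::vec<3, float>;
using Mat4 = fakeglm::mat<4, 4, float>;

// Primary templates say "no"; the partial specializations name the library templates.
template<class T> struct IsGLMVector : std::false_type {};
template<int L, class T> struct IsGLMVector<fakeglm::vec<L, T>> : std::true_type {};

template<class T> struct IsGLMMatrix : std::false_type {};
template<int C, int R, class T> struct IsGLMMatrix<fakeglm::mat<C, R, T>> : std::true_type {};

static_assert(IsGLMVector<Vec3>::value);
static_assert(!IsGLMVector<float>::value);
static_assert(IsGLMMatrix<Mat4>::value && !IsGLMMatrix<Vec3>::value);
```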


Build Verification

  • 8 commits on `epic/math-...

v2.1.0 — shader-asset-pipeline

18 Apr 15:50

Date: 2026-04-18
Commits: 5 (on epic/shader-asset-pipeline)
Issue: #79


Overview

First epic of the post-v2.0 architecture-review series. Rewrote the shader asset pipeline around single-stage shader assets: each .vert, .frag, or .comp file on disk is one asset with one UUID and one SPIR-V artifact. No more .vert+.frag pairing assumption in the importer, no more runtime ShaderCompiler::Compile fallback in the renderer, no more Fragment shader not found errors on startup.

Minor version bump to v2.1.0 per the ROADMAP MINOR rule (one completed epic with user-visible changes: the startup error is gone, and second-run launches are faster thanks to cached SPIR-V for all 24 engine shaders).


Sub-Tasks

| # | Sub-task | Commit |
|---|---|---|
| ABC | Single-stage shader asset model (schema + importer + Shader class) | 51d796a refactor(assets): single-stage shader assets (schema, importer, Shader class) |
| D | RenderPipeline + IBLPrecompute migration through ShaderLibrary::LoadEngine | 77f2c2f refactor(renderer): load all engine shaders through asset pipeline |
| E | Hot-reload for any stage; remove dead RecompileUtilityShaders fallback | de34495 refactor(renderer): hot-reload any shader stage, remove utility-recompile fallback |
|   | Register .meta files for compute shaders (generated on first run) | d7ab811 chore(assets): register .meta files for compute shaders |
| F | Docs + v2.1.0 + history + wrap-up | chore(release): shader-asset-pipeline → v2.1.0 |

(ABC bundled because the schema / importer / Shader-class reshape cannot be split build-clean — see "Sub-task granularity" in the epic spec.)


Schema Change

Before (v1 shader artifact)

struct ShaderAssetData { std::vector<u32> VertexSpirV; std::vector<u32> FragmentSpirV; std::string SourcePath; };
struct ShaderHeader    { u32 VertexSpirVSize; u32 FragmentSpirVSize; };
// Artifact: [AssetHeader(v=1)][ShaderHeader{VertexSize,FragmentSize}][VertSpirV][FragSpirV]

ShaderImporter::Import(.vert) compiled both .vert + its paired .frag. .frag source files had no artifact of their own (importer returned true without writing). Vertex-only / fragment-only / compute shaders could not be assets — they were runtime-compiled inline in RenderPipeline::Initialize / RecompileUtilityShaders / InitCullPipeline / InitAOResources / InitDebugBlitResources / IBLPrecompute (≈45 call sites, no caching, every launch).

After (v2 shader artifact)

enum class ShaderStage : u32 { Unknown, Vertex, Fragment, Compute };
struct ShaderAssetData { ShaderStage Stage; std::vector<u32> SpirV; std::string SourcePath; };
struct ShaderHeader    { u32 Stage; u32 SpirVSize; };
// Artifact: [AssetHeader(v=2)][ShaderHeader{Stage,SpirVSize}][SpirV]

DeserializeShader rejects version != 2 so V1 artifacts force a re-import under the new schema on first launch after this lands.
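
A rough sketch of the version gate. The AssetHeader layout, the byte-vector input, the optional return (the notes describe a bool), and the MakeArtifact helper are all assumptions for illustration; only the struct fields shown above come from this release.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <optional>
#include <string>
#include <vector>

using u32 = std::uint32_t;

enum class ShaderStage : u32 { Unknown, Vertex, Fragment, Compute };
struct ShaderAssetData { ShaderStage Stage; std::vector<u32> SpirV; std::string SourcePath; };

// Assumed layouts: the notes only say the artifact starts with a versioned AssetHeader.
struct AssetHeader  { u32 Version; };
struct ShaderHeader { u32 Stage; u32 SpirVSize; };   // SpirVSize assumed to be in bytes

std::optional<ShaderAssetData> DeserializeShader(const std::vector<std::uint8_t>& bytes) {
    if (bytes.size() < sizeof(AssetHeader) + sizeof(ShaderHeader)) return std::nullopt;

    AssetHeader asset{};
    std::memcpy(&asset, bytes.data(), sizeof(asset));
    if (asset.Version != 2) return std::nullopt;      // V1 or unknown: caller re-imports

    ShaderHeader header{};
    std::memcpy(&header, bytes.data() + sizeof(asset), sizeof(header));
    const std::size_t offset = sizeof(asset) + sizeof(header);
    if (header.SpirVSize % sizeof(u32) != 0) return std::nullopt;
    if (bytes.size() - offset < header.SpirVSize) return std::nullopt;

    ShaderAssetData out{};
    out.Stage = static_cast<ShaderStage>(header.Stage);
    out.SpirV.resize(header.SpirVSize / sizeof(u32));
    if (header.SpirVSize > 0)
        std::memcpy(out.SpirV.data(), bytes.data() + offset, header.SpirVSize);
    return out;
}

// Hypothetical helper: writes [AssetHeader][ShaderHeader][SpirV] with the given version.
std::vector<std::uint8_t> MakeArtifact(u32 version, ShaderStage stage, const std::vector<u32>& spirv) {
    const AssetHeader asset{ version };
    const ShaderHeader header{ static_cast<u32>(stage), static_cast<u32>(spirv.size() * sizeof(u32)) };
    std::vector<std::uint8_t> bytes(sizeof(asset) + sizeof(header) + header.SpirVSize);
    std::memcpy(bytes.data(), &asset, sizeof(asset));
    std::memcpy(bytes.data() + sizeof(asset), &header, sizeof(header));
    if (!spirv.empty())
        std::memcpy(bytes.data() + sizeof(asset) + sizeof(header), spirv.data(), header.SpirVSize);
    return bytes;
}
```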


Shader / VulkanShader Contract

// Shader (was multi-stage container; now single-stage)
class Shader : public Asset {
    virtual ShaderStage GetStage() const = 0;
    virtual const std::vector<u32>& GetSpirV() const = 0;
    virtual const fs::path& GetPath() const = 0;
    virtual bool IsValid() const = 0;
    virtual void Reload() = 0;
    static std::shared_ptr<Shader> Create(ShaderStage, const std::vector<u32>& spirv, const fs::path&);
};

// VulkanShader stores one VkShaderModule + one VkPipelineShaderStageCreateInfo.
// Removed: CompileOrGetVulkanBinaries pairing heuristics.
// Removed: old Create(vertSpv, fragSpv, path) / VulkanShader(vertSpv, fragSpv, path) overloads.

Pipeline construction in RenderPipeline::CreatePipelines is unchanged — it still consumes raw m_*Spv blobs. What changed is where those blobs come from: previously runtime ShaderCompiler::Compile, now ShaderLibrary::LoadEngine("shaders/x.ext")->GetSpirV() on the shared asset.


ShaderLibrary Changes

  • New ShaderLibrary::LoadEngine(engineRelPath): idempotent loader + registrar. Keys library entries by filename ("pbr.vert", "gtao_main.comp"). Internally: AssetDatabase::GetUUID → AssetManager::LoadImmediate → Register.
  • Keys migrated from friendly names ("pbr", "shadowDepth") to per-stage filenames ("pbr.vert", "pbr.frag", ...).
  • Reload callback now handles all 24 engine shader filenames (graphics + compute): pulls fresh SPIR-V into the cached m_*Spv blob and rebuilds the affected pipeline(s). Compute-pipeline rebuild is inline (push-constant sizes duplicated from InitCullPipeline / InitAOResources; acceptable for now).
  • File watcher filter extended from .vert|.frag to .vert|.frag|.comp. Unmatched files now log a warning; the old m_PendingUtilityReload fallback flag + RenderPipeline::RecompileUtilityShaders() dead-path were deleted.
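
The idempotent, filename-keyed loader described in the first two bullets could look roughly like this. ShaderLibrarySketch, its Shader stand-in, and LoadCount are hypothetical; the real path goes through the asset database and artifact cache rather than just recording the path.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// Minimal stand-in for a loaded single-stage shader asset (hypothetical).
struct Shader { std::string Path; };

class ShaderLibrarySketch {
public:
    // Idempotent: the first call loads and registers, later calls return the same entry.
    std::shared_ptr<Shader> LoadEngine(const std::string& engineRelPath) {
        // "shaders/pbr.vert" -> "pbr.vert": the per-stage filename is the library key.
        const std::size_t slash = engineRelPath.find_last_of('/');
        const std::string key =
            (slash == std::string::npos) ? engineRelPath : engineRelPath.substr(slash + 1);

        auto it = m_Shaders.find(key);
        if (it != m_Shaders.end()) return it->second;

        // The real loader resolves a UUID and loads the cached artifact; this only records the path.
        auto shader = std::make_shared<Shader>(Shader{ engineRelPath });
        ++m_LoadCount;
        m_Shaders.emplace(key, shader);
        return shader;
    }

    std::shared_ptr<Shader> Get(const std::string& key) const {
        auto it = m_Shaders.find(key);
        if (it == m_Shaders.end()) return nullptr;
        return it->second;
    }

    int LoadCount() const { return m_LoadCount; }

private:
    std::unordered_map<std::string, std::shared_ptr<Shader>> m_Shaders;
    int m_LoadCount = 0;
};
```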

Scanner Change

FileSystem::GetAssetTypeFromPath now recognizes .comp. On first launch after this lands, AssetDatabase::InitEngine discovers all 8 compute shaders under luth/assets/shaders/ and generates .meta files with stable UUIDs (committed in d7ab811). Same dirty-tracking + artifact-cache path as .vert/.frag.


Call-site Migrations

Replaced every ShaderCompiler::Compile(shadersPath / "x.ext") in:

| File | Calls |
|---|---|
| renderer/RenderPipeline.cpp (Initialize, InitCullPipeline, InitAOResources, InitDebugBlitResources) | 20 |
| renderer/lighting/IBLPrecompute.cpp (equirect / irradiance / prefilter / brdf_lut / skybox) | 6 |

ShaderCompiler::Compile is now only invoked by (a) ShaderImporter::Import (on asset import / hot-reload artifact refresh), and (b) VulkanShader::Reload (in-memory recompile of the one stage the shader owns).

Also dropped dead #include "luth/renderer/shader/ShaderCompiler.h" from 10 files (9 passes + RenderingSystem.cpp) that no longer call it.


Key Design Decisions

Single-stage asset = disk file

One .vert file = one asset = one UUID = one artifact. Pipelines combine stages at creation time, not at import time. This removes the implicit "vert's friend is a frag with the same stem" coupling that broke any vertex-only / compute / fragment-only shader.

V2 rejects V1 instead of silent upgrade

DeserializeShader checks header.Version != 2 and returns false, which triggers a re-import on first load. No silent format conversion, no zombie V1 data on disk. Cleaner than reading both schemas and trying to pick the right one.

PipelineManager keyed by vertex-shader UUID

GetOrCreate(shaderUUID, renderMode, cullMode, polyMode, vertSpv, fragSpv) still uses a single UUID as the cache key. Chose ShaderLibrary::Get("pbr.vert")->Handle as the canonical key for the PBR pipeline family (vs. introducing a synthetic program-UUID or hashing both). Stable across launches, consistent with the old behavior (the old "pbr" library entry WAS pbr.vert). The reload callback invalidates with the canonical key regardless of which stage was edited, so a pbr.frag reload correctly drops cached PBR pipelines.

ABC bundled into one commit

Schema rewrite (A), importer rewrite (B), and Shader/VulkanShader reshape (C) are interlocked: changing one in isolation breaks the build. The epic's issue-level checklist tracks all three, but the commit is one atomic refactor. D, E, F land as separate commits since they're each build-clean on their own.

RecompileUtilityShaders deleted (no deprecation window)

After D, every engine shader is in ShaderLibrary, so the file-watcher always finds a library match and the m_PendingUtilityReload fallback never fires. Deleted along with m_PendingUtilityReload and the drain branch in RenderingSystem::Update rather than leaving dead code. Per-epic principle: no backwards-compat shims.

Compute pipeline rebuild inline in the callback

Hot-reload of a .comp shader rebuilds the matching VKComputePipeline inline in the ShaderLibrary reload callback (push-constant sizes + descriptor-layout handle copied from the Init* call sites). Could factor out a RebuildComputePipeline(name) helper — left inline for this epic; clean-up candidate for render-pipeline-split (E6 in the review plan).


Build Verification

  • 5 work commits on epic/shader-asset-pipeline; every commit builds Debug x64 clean (MSBuild /v:minimal reports zero errors; only pre-existing warnings — C4267/C4244/C4996/LNK4006).
  • Full 3-project solution (Luth, Luthien, Runtime) builds unchanged.
  • grep ShaderCompiler::Compile — only 2 legitimate call sites remain (ShaderImporter::Import, VulkanShader::Reload).
  • grep "VertexSpirV\|FragmentSpirV" — zero in source (only in history docs).

Runtime Verification (user smoke test)

  • Delete luth/Library/Artifacts/ → relaunch: every .vert/.frag/.comp imports once, no Fragment shader not found error.
  • Second launch: no SPIR-V recompilation (artifact mtimes unchanged).
  • Full render: PBR + shadows + GTAO + bloom + skybox + outline + grid + ImGui visually identical.
  • Frame Debugger captures + replays a frame.
  • Skinned mesh renders with shadows.
  • Edit depthPrepass.vert → hot-reload fires → depth pass rebuilds.
  • Edit gtao_main.comp → compute pipeline rebuilds; AO still renders.

Lessons

Pairing assumptions leak into asset schemas. The root cause of the Fragment shader not found error wasn't the importer's check — it was that the importer had ever been modeled around graphics-pipeline topology in the first place. Fix at the asset-model level, not the importer error path.

Atomic commits don't always mean one sub-task per commit. The issue body split the refactor into A/B/C so the scope tracking stays granular on GitHub, bu...

v2.0.0 — arch-target-split

18 Apr 13:03

Date: 2026-04-18
Commits: 5
Issue: #78


Overview

Phase 5 of the architecture refactor — the final arch-* epic. Extracted ~12 071 LOC of editor code from Luth.lib into a new Luthien.lib static library, renamed the luthien/ exe folder to runtime/, and broke the engine→editor include dependency via an IEditorHooks interface. After this epic, Luth.lib has zero luthien/... includes and the one-way-dependency invariant is enforced by a git grep check.

Major version bump to v2.0.0 per the ROADMAP versioning rule (fundamental architecture change).

See the multi-epic plan: docs/development/ARCH-REFACTOR-PLAN.md.


Sub-Tasks

| # | Sub-task | Commit |
|---|---|---|
| A | Rename luthien/ exe folder to runtime/ | refactor(build): rename luthien exe folder to runtime |
| B | Extract editor into Luthien.lib + introduce IEditorHooks | refactor(build): extract editor into Luthien.lib |
| C | Untrack editor-state files | chore(repo): untrack editor state files |
| D | Regen + docs + v2.0.0 + history + release | chore(build): finalize target split, bump v2.0.0 |

Directory Changes

New folders

  • luthien/source/luthien/ — editor code (was luth/source/luth/editor/). Mirrors luth/source/luth/.

Renamed

  • luthien/ exe folder → runtime/ (git mv; LuthienApp.cpp, Luthien.rc, icons/, resource.h histories preserved)
  • luth/source/luth/editor/** → luthien/source/luthien/** (git mv; 30+ files + subtrees)

New files

  • luth/source/luth/core/EditorHooks.{h,cpp}: IEditorHooks interface + EditorHooks::Register/Get
  • luthien/premake5.lua — new Luthien.lib static-lib project
  • luthien/source/lepch.{h,cpp} — editor PCH (includes luthpch.h + ImGui + Vulkan)
  • luthien/source/luthien/Bootstrap.h — declares InstallLuthienEditorHooks()
  • luthien/source/luthien/EditorHooks.cpp: LuthienEditorHooks impl forwarding to Editor::* / ProjectLauncher::* / EditorSelection::*

Bulk rewrites

  • 142 #include "luth/editor/..." → #include "luthien/..." across 45 files (perl + binmode for CRLF preservation)
  • 26 editor .cpp files: #include "luthpch.h" → #include "lepch.h" (new editor PCH)

Target layout after the epic

| Target | Kind | Links | Contents |
|---|---|---|---|
| Luth.lib | StaticLib |  | Engine only; no editor/panel code |
| Luthien.lib | StaticLib | Luth | Editor: panels, inspectors, commands, style, widgets, hook impl |
| Luthien.exe | ConsoleApp | Luth, Luthien | Editor application (runtime/Runtime project, targetname Luthien) |

Untracked

  • runtime/editor_settings.json, runtime/imgui.ini, samples/editor_settings.json, samples/cache/pipeline.bin (git rm --cached + new .gitignore patterns editor_settings.json and samples/cache/)

Key Design Decisions

IEditorHooks instead of a full API redesign

The structural split surfaced deep coupling: App.cpp drove the editor's per-frame lifecycle via 14 direct Editor::* / ProjectLauncher::* / EditorSelection::* call sites; Input.cpp queried Editor::WantCaptureKeyboard/Mouse; Luth.h (public umbrella) included Editor.h. A full API redesign (virtual App hooks, event bus, RHI layer) would have tripled scope.

Compromise: a minimal nullptr-safe IEditorHooks interface in luth/core/EditorHooks.{h,cpp} with 18 virtual methods covering the exact call-site set. LuthienEditorHooks (in Luthien.lib) forwards each call. Registration runs in runtime/LuthienApp.cpp::CreateApp before App::App() constructs, so the hook is live from the first EditorHooks::Get() call onward. A runtime-only host that skips linking Luthien.lib leaves the registry empty and every engine-side call nullptr-checks cleanly to a no-op.
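
A minimal sketch of the nullptr-safe registry pattern. Only one of the 18 methods is shown; the Register/Get signatures, the Slot helper, EngineShouldHandleKeyboard, and FakeEditorHooks are assumptions made for illustration.

```cpp
#include <cassert>

// Interface the engine sees; a single method stands in for the full set.
struct IEditorHooks {
    virtual ~IEditorHooks() = default;
    virtual bool WantCaptureKeyboard() const = 0;
};

namespace EditorHooks {
    // Function-local static keeps the registry header-only in this sketch.
    inline IEditorHooks*& Slot() { static IEditorHooks* s_Hooks = nullptr; return s_Hooks; }
    inline void Register(IEditorHooks* hooks) { Slot() = hooks; }
    inline IEditorHooks* Get() { return Slot(); }
}

// Engine-side call site: an empty registry means "no editor", not a crash.
bool EngineShouldHandleKeyboard() {
    if (IEditorHooks* hooks = EditorHooks::Get())
        return !hooks->WantCaptureKeyboard();
    return true;
}

// Editor-side implementation (would live in the editor library and forward to editor code).
struct FakeEditorHooks final : IEditorHooks {
    bool Capturing = false;
    bool WantCaptureKeyboard() const override { return Capturing; }
};
```

A host that never calls Register takes the fallback branch at every call site, which matches the runtime-only behavior described above.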

EditorViewportState snapshot instead of per-getter dispatch

App::Run's per-frame block was building CameraParams from Editor::GetPanel<ScenePanel>()->GetEditorCamera().GetViewMatrix() + 6 more getters. Putting each behind a virtual call would cost 10+ dispatches per frame. Replaced with a single IEditorHooks::GetViewportState(EditorViewportState&) that fills a POD (view/proj/pos/near/far/IBL/selection) in one roundtrip. Engine builds CameraParams from it.
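
The snapshot approach, sketched under assumptions: the field names, the void return type, the IViewportHooks interface name, and the CameraParams shape are all hypothetical. The point is that one virtual call fills a plain struct and the engine reads plain fields afterwards.

```cpp
#include <cassert>

// Plain-data snapshot; the field set is guessed from the list in the text.
struct EditorViewportState {
    float View[16]{};
    float Proj[16]{};
    float Position[3]{};
    float Near = 0.1f;
    float Far = 1000.0f;
    bool  HasSelection = false;
};

struct IViewportHooks {
    virtual ~IViewportHooks() = default;
    // One virtual call fills everything the engine needs for the frame.
    virtual void GetViewportState(EditorViewportState& out) const = 0;
};

struct CameraParams { float Near; float Far; float Position[3]; };

// Engine side: build camera parameters from the snapshot, or use defaults with no editor.
CameraParams BuildCameraParams(const IViewportHooks* hooks) {
    EditorViewportState state{};
    if (hooks) hooks->GetViewportState(state);
    return CameraParams{ state.Near, state.Far,
                         { state.Position[0], state.Position[1], state.Position[2] } };
}

// Editor side: copies its camera into the snapshot in one go.
struct FakeViewport final : IViewportHooks {
    void GetViewportState(EditorViewportState& out) const override {
        out.Near = 0.5f;
        out.Far = 250.0f;
        out.Position[0] = 1.0f; out.Position[1] = 2.0f; out.Position[2] = 3.0f;
    }
};
```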

Sandbox.exe descoped

Issue #78 sub-task D requested a Sandbox.exe target. Earlier experiments had one and it was removed as clutter. The structural goal (prove Luth.lib ships without editor) is enforced more cheaply:

  • Physical: after B, Luth.lib's files { "source/**" } glob excludes luthien/, so editor .obj cannot link in
  • Invariant: git grep -l 'luthien/\|Luthien' luth/source → zero (gated in D)

A future player/standalone harness can be added when there's a real consumer.

Layout luthien/source/luthien/ (mirrors luth/source/luth/)

Alternative was editor/source/luthien/ from ARCH-REFACTOR-PLAN.md. Chose the flatter form for symmetry with the engine layout — one less nesting level, and the Luthien brand is visible at repo root.

Sub-task order reversed from issue #78

Issue ordered (A) extract editor, (C) rename luthien/ → runtime/. Flipped to (A) rename first, (B) move editor in. The rename is low-risk (pure git mv + premake tweak); doing it first frees the luthien/ folder name, then the high-risk ~12k-LOC move lands in an empty target. Risk-staging improved.

VS project-name + binary-name split

New Luthien.lib wanted project "Luthien". Old Luthien.exe also had that project name. Collision resolved: exe project renamed to "Runtime" with targetname "Luthien" so the output binary stays Luthien.exe. startproject "Runtime". CI artifact paths in .github/workflows/build.yml updated bin/.../Luthien/ → bin/.../Runtime/.

Circular static-lib dependency avoided

Before the hooks interface, the compile-clean intermediate state had Luth.lib referencing Editor:: symbols and Luthien.lib referencing Luth:: symbols — a cyclic static-lib dependency that MSVC's multi-pass linker handles at exe link time. After IEditorHooks, the cycle is gone: Luth.lib has no unresolved editor symbols, Luthien.lib depends on Luth, Luthien.exe links both. Clean one-way.

imgui refs in luth/source are legitimate engine infrastructure

The initial spec's D-verify included git grep -l 'imgui' luth/source → zero matches. This check was based on a misunderstanding. The engine legitimately uses ImGui in its render pipeline:

  • renderer/passes/ImGuiPass.cpp — render-graph pass that composites ImGui draw data
  • renderer/RenderPipeline.cpp — wires the ImGui pass into the graph
  • renderer/FrameDebugger.cpp — capture-time UI
  • renderer/rendergraph/ArchivedImage.{h,cpp} — archive-image UI
  • platform/WinWindow.cpp — GLFW↔ImGui event bridging
  • scene/systems/RenderingSystem.cpp — adds the ImGui pass

These are engine-level ImGui integrations, not editor code. The correct engine-cleanness invariant is luthien/|Luthien grep returning zero, not ImGui absence.


Shrinkage / Measurement

Editor LOC moved: ~12 071 (from ARCH-REFACTOR-PLAN.md's pre-epic tally). New engine-side code: ~120 LOC (EditorHooks.h/.cpp + Bootstrap.h + LuthienEditorHooks.cpp impl).

Post-split binary sizes (Debug x64):

| Binary | Size | Notes |
|---|---|---|
| Luth.lib | 294 MB | Engine only; debug symbols bloat it |
| Luthien.lib | 141 MB | Editor + ImGui + ImGuizmo glue |
| Luthien.exe | 18 MB | Runtime binary linking both libs |

Pre-epic Luth.lib baseline not captured — measurement deferred (would require a pre-epic rebuild).


Lessons

Editor-extraction surfaced engine→editor coupling. App.cpp had 14 direct editor calls driving the editor's per-frame lifecycle; Input.cpp queried ImGui-capture state; Luth.h pulled Editor.h. The physical file move alone left this at include level. Adding IEditorHooks + nullptr-safe dispatch broke it cleanly with minimal API surface. A virtual-method App redesign would have been 5× the work — the hook interface is the right scope for a structural epic.

Git rename threshold can miss small files. Command.h had 10 lines total, 5 of which were #include "luth/editor/commands/..." lines that the perl rewrite changed. That's a 50%+ edit by git's default similarity metric, so git showed the move as delete + create instead of a rename. Larger files with the same 5-line change detect cleanly. --find-renames=30% on git log / git diff lowers the threshold if needed.

Bulk rewrites with perl + binmode. Same recipe as arch-cleanup: perl -i -pe 'BEGIN{binmode(ARGV);binmode(STDOUT);} s|old|new|g' preserves CRLF on Windows. 168 rewrites in this epic (142 include paths + 26 PCH switches) all clean — no stray LF/CRLF flips in git diff.

ImGui is engine infrastructure, not editor-exclusive. The spec's initial assumption was that ImGui only lives in editor code. Real dependency is in 6 engine files across rendergraph, frame debugger, and platform. The correct editor-cleanness invariant is luthien/|Luthien grep, not imgui grep.

Preserved a functioning editor throughout. Every commit landed build-clean with full editor parity. No multi-commit broken state — sub-staging (B1/B2/B3 contingency) wasn't needed because the hooks interface was designed before any engine code was rewritten, keeping each commit atomic.


Build Verification

  • 4 work commits (plus the kickoff docs(epic): add arch-target-split spec)
  • All sub-tasks build Debug x64 clean at HEAD; warnings are pre-existing noise (C4267 size_t→uint32_t, C4244 chrono::rep, C4996 getenv, LNK4006 Vulkan import-descriptor duplicates)
  • 3-...

v1.7.0 — arch-renderer-split

18 Apr 03:38

Date: 2026-04-18
Commits: 9
Issue: #77


Overview

Phase 3–4 of the architecture refactor. Dissolved the ~3 500-LOC RenderingSystem god-class (which lived in scene/systems/ but was the de-facto renderer) into focused classes under renderer/. Consolidated scattered animation code into a new animation/ module.

After this epic, scene/systems/RenderingSystem is a ~350-LOC ECS glue layer; all graphics resources (pipelines, descriptor sets, SPIR-V, UBOs, SSBOs, IBL maps, bloom textures, GPU timers, named-texture registry, preview textures) live on RenderPipeline in renderer/; animation data has its own top-level module.

See the multi-epic plan: docs/development/ARCH-REFACTOR-PLAN.md.


Sub-Tasks

| # | Sub-task | Commit |
|---|---|---|
| A | Extract FrameTargets from RenderingSystem | refactor(render): extract FrameTargets from RenderingSystem |
| B | Extract DrawListBuilder | refactor(render): extract DrawListBuilder from RenderingSystem |
| C | Extract LightGatherer + CascadeBuilder | refactor(render): extract light gathering + CSM cascade build |
| D | Extract RenderPipeline (graph assembly) | refactor(render): extract RenderPipeline graph assembly |
| E1 | Thin RenderingSystem, init migration | refactor(scene): thin RenderingSystem (init migration) |
| E2 | Thin RenderingSystem, per-frame + debug migration | refactor(scene): thin RenderingSystem (per-frame + debug migration) |
| E3 | Thin RenderingSystem, field ownership migration | refactor(scene): thin RenderingSystem (field ownership migration) |
|   | Fix skybox reload on project change | fix(editor): reload skybox on project change |
| F | Consolidate animation/ module | refactor(animation): consolidate animation module |

Directory Changes

New files

  • renderer/FrameTargets.{h,cpp} — owns persistent scene textures (SceneColor/Depth, LDR, EntityID, Selection {mask,depth})
  • renderer/DrawListBuilder.{h,cpp} — walks ECS once, partitions entities into opaque/cutout/transparent draw buckets
  • renderer/draw/DrawList.h — bucket struct with tri-count summary
  • renderer/lighting/LightGatherer.{h,cpp} — ECS → LightUniforms + shadow config
  • renderer/lighting/CascadeBuilder.{h,cpp} — PSSM split + per-cascade ortho fit
  • renderer/RenderPipeline.{h,cpp} — graph assembly + all graphics resources (~3 150 LOC)
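
The single-walk bucket partition attributed to DrawListBuilder above might be sketched like this. Renderable, BlendMode, and the field names are hypothetical; the real builder walks ECS components rather than a flat vector.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class BlendMode { Opaque, Cutout, Transparent };

// Hypothetical flattened view of one renderable entity.
struct Renderable { std::uint32_t EntityId; BlendMode Mode; std::uint32_t TriangleCount; };

struct DrawList {
    std::vector<std::uint32_t> Opaque, Cutout, Transparent;
    std::uint64_t TotalTriangles = 0;   // tri-count summary
};

// Single pass over the renderables: each lands in exactly one bucket.
DrawList BuildDrawList(const std::vector<Renderable>& renderables) {
    DrawList list;
    for (const Renderable& r : renderables) {
        switch (r.Mode) {
            case BlendMode::Opaque:      list.Opaque.push_back(r.EntityId);      break;
            case BlendMode::Cutout:      list.Cutout.push_back(r.EntityId);      break;
            case BlendMode::Transparent: list.Transparent.push_back(r.EntityId); break;
        }
        list.TotalTriangles += r.TriangleCount;
    }
    return list;
}
```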

New folder

  • animation/ — houses AnimationClip.h, Skeleton.h, BoneMatrixBuffer.{h,cpp}, AnimationController.h (all moved via git mv, history preserved)

Moved

  • 5 files moved into animation/ (from renderer/ and scene/)
  • 19 caller files bulk-rewrote their includes

Added to lighting/LightTypes.h

  • DirectionalLightShadowParams — per-frame shadow config from Component::DirectionalLight
  • CascadeData — per-frame CSM output (4 matrices + split view-Z + texel sizes)
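
For context on the split view-Z values: a sketch of the commonly used practical PSSM split scheme, which blends logarithmic and uniform splits. Whether CascadeBuilder uses exactly this blend, and the lambda parameter and function name, are assumptions.

```cpp
#include <array>
#include <cassert>
#include <cmath>

constexpr int kCascadeCount = 4;

// Practical split scheme: blend logarithmic and uniform splits with weight lambda in [0, 1].
std::array<float, kCascadeCount> ComputeCascadeSplits(float nearZ, float farZ, float lambda) {
    assert(nearZ > 0.0f && farZ > nearZ);
    std::array<float, kCascadeCount> splits{};
    for (int i = 1; i <= kCascadeCount; ++i) {
        const float t = static_cast<float>(i) / kCascadeCount;
        const float logSplit = nearZ * std::pow(farZ / nearZ, t);   // dense near the camera
        const float uniSplit = nearZ + (farZ - nearZ) * t;          // evenly spaced
        splits[i - 1] = lambda * logSplit + (1.0f - lambda) * uniSplit;
    }
    return splits;   // far edge of each cascade, in view-space depth
}
```

With lambda = 0 the splits are evenly spaced; raising it toward 1 pulls the near cascades closer to the camera, where shadow resolution matters most.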

Shrinkage

File | Before | After | Δ
scene/systems/RenderingSystem.cpp | ~3 500 LOC | 348 LOC | −90%
scene/systems/RenderingSystem.h | ~490 LOC | 194 LOC | −60%

The remaining RenderingSystem is the ECS glue layer the spec targeted: Update() orchestration, UpdateLightUniforms() (CPU-side gather + cascade build), mouse picking, editor-facing getters/setters, frame-debugger state, shader hot-reload dispatch. The ~100-LOC aspirational target was approached but not hit — RenderingSystem still holds FrameTargets, DrawListBuilder, LightGatherer, CascadeBuilder, FrameDebugger, CameraParams, and editor state, because those are all scene-level inputs per the spec's target shape.


Key Design Decisions

Bidirectional friend class kept

friend class RenderPipeline; on RenderingSystem allows RenderPipeline methods to read RS-side per-frame inputs (CameraParams, ShadowParams, Cascades, FrameTargets, FrameDebugger, DrawList, editor toggles) without widening the public API to ~25 getters. The coupling is intrinsic: Pipeline consumes scene state that by design lives on the ECS-glue layer. Dropping friend was an aspirational goal, not worth the verbosity. RenderingSystem.h drops most Vulkan includes as a result — only VkSampler (via FrameDebugger) and VkImageView (preview getter return types) remain.
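
To make the trade-off concrete, here is a minimal sketch of the friend grant, assuming heavily reduced stand-ins for both classes. `CameraParams` here carries only two fields, and `SetCameraRange` and `DepthRange` are hypothetical names used for illustration. It shows only the RS-to-Pipeline direction of the access.

```cpp
#include <cassert>

// Hypothetical, reduced stand-in for the real per-frame camera input.
struct CameraParams {
    float nearZ = 0.1f;
    float farZ = 1000.0f;
};

class RenderPipeline;  // forward declaration so it can be named as a friend

class RenderingSystem {
public:
    // Hypothetical setter standing in for the editor-facing API.
    void SetCameraRange(float nearZ, float farZ) {
        assert(nearZ > 0.0f && farZ > nearZ);
        m_Camera = {nearZ, farZ};
    }

private:
    friend class RenderPipeline;  // read access without widening the public API
    CameraParams m_Camera;
};

class RenderPipeline {
public:
    explicit RenderPipeline(const RenderingSystem& system) : m_System(system) {}

    // Reads an RS-side private field directly through the friend grant.
    float DepthRange() const {
        return m_System.m_Camera.farZ - m_System.m_Camera.nearZ;
    }

private:
    const RenderingSystem& m_System;
};
```

The alternative would be one public getter per consumed field, which is the "~25 getters" cost the decision above avoids.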

Sub-task D staging

Moving the entire graph-assembly chain + all 13 Add*Pass methods + CollectSelectedHandles + CaptureSnapshot was a 1 500-LOC mechanical move. Executed atomically via perl rewrite: RenderingSystem:: → RenderPipeline:: on class qualifiers, m_X → m_System.m_X on member accesses (since fields still lived on RS at that point). Friend class granted access. E1–E3 later inverted the perl rewrite for fields that migrated.

Sub-task E sub-staged into E1/E2/E3

The "≤ 100 LOC" target in the spec required ~2 000 LOC of migration across 4 files — too risky for a single commit. Split into three atomic sub-commits:

  • E1 — init routines + ctor/dtor → Pipeline::Initialize/Shutdown/OnResize. RenderingSystem.cpp: 3 073 → 1 194 LOC.
  • E2 — per-frame Update* helpers + BuildGPUObjectBuffer + debug blit/preview helpers moved. 1 194 → 324 LOC.
  • E3 — field ownership migration. ~40 graphics fields moved from RS to Pipeline; pass files bulk-rewrote m_System.m_X → m_X. RS-retained fields kept their m_System. prefix via negative-lookbehind perl.

k_MaxGPUObjects + indirect-region constants moved

The static constexpr u32 k_MaxGPUObjects = 4096; constant (plus k_IndirectRegionCount / k_IndirectRegionStride) migrated from RenderingSystem:: private statics to RenderPipeline:: public statics in E3. Pass files consume them via RenderPipeline::k_MaxGPUObjects.

animation/ module carved out (sub-task F)

AnimationClip, Skeleton, BoneMatrixBuffer (data + GPU buffer) moved from renderer/. AnimationController (blend controller) moved from scene/. AnimationSystem stayed in scene/systems/ because it walks Component::Animation + Component::BoneAttachment — ECS territory by design.


Skybox Init Bugfix

Partway through E3 verification, the skybox rendered black at startup until the user manually reloaded it via the editor. Root cause: RenderingSystem::ctor runs during App::App before any project loads. FileSystem::ResolveAsset("textures/environment.hdr") falls back to engine assets (s_HasProject = false), but the engine ships no HDR — only samples/assets/textures/environment.hdr exists. IBL::Precompute hits its fallback path, returns 1×1 placeholders without skybox SPIR-V, and CreatePipelines silently skips the skybox pipeline. Likely pre-existing but surfaced during refactor verification.

Fix: Editor::OnProjectChanged (tail of App::LoadProject) now re-resolves the settings' skyboxPath against the freshly set project root and calls ReloadSkybox if the resolved path exists.


Lessons

Scope ambition vs. commit granularity. The spec's "≤ 100 LOC" target for RenderingSystem.cpp was the right aspiration but couldn't be delivered atomically. The epic spec itself suggested sub-staging D into D1/D2/D3 if needed; E adopted the same pattern. Per-commit build verification is non-negotiable; the single-commit target would have been a multi-day breakage risk.

Perl lookbehind saves double-rewrites. Each sub-task's bulk rewrite had to avoid re-rewriting already-prefixed accesses. (?<!m_System\.)\bm_X\b is the pattern — it runs idempotently over a file that's been partially rewritten, so repeat runs are safe.

Refactor verification surfaces pre-existing bugs. The skybox-black issue likely existed before the epic — the refactor just put eyes on it. Worth a post-epic pass on anything that looks "working" but might have a similar latent flaw.

File moves with git mv preserve blame. Sub-task F's 5-file move kept git log --follow history intact; the diff showed 0-line changes on the moved files themselves. Trivial but important for ongoing code archaeology.


Build Verification

  • Debug x64 builds clean after every sub-task (9 incremental builds)
  • Premake regeneration clean on each sub-task
  • No new warnings (only pre-existing C4267 / LNK4006 noise)
  • Luth.lib + Luthien.exe artifacts produced

Runtime verification (user-confirmed)

  • A: FrameTargets resize + render-pass parity ✅
  • B: tri-count + opaque/cutout/transparent ordering preserved ✅
  • C: CSM cascades split identically ✅
  • D: full visual parity + Frame Debugger works ✅
  • E1/E2/E3: startup, shutdown, resize, hot-reload, picking, capture ✅
  • F: skinned mesh animation + bone debug overlay ✅
  • Skybox fix: loads correctly when project opens ✅

v1.6.0 — arch-cleanup

18 Apr 00:36

v1.6.0 — arch-cleanup

Date: 2026-04-18
Commits: 8
Issue: #76


Overview

Phase 1–2 of the architecture refactor. Low-risk mechanical moves that clean up folder misalignment, preparing the tree for the larger arch-renderer-split and arch-target-split epics. No behavior change — all work is structural.

See the multi-epic plan: docs/development/ARCH-REFACTOR-PLAN.md.


Sub-Tasks

# | Sub-task | Commit
A | Extract events/ from platform/ | refactor(events): extract event types from platform/
B | Disperse utils/ into editor/core/resources | refactor(utils): disperse utils/ into editor/core/resources
C | Move FrameData.h from renderer to core | refactor(core): move FrameData from renderer to core
D | Rename Systems → SystemRegistry, fix ownership | refactor(scene): rename Systems->SystemRegistry, fix ownership
E | Split Components.h into components/ subfolder | refactor(scene): split Components.h into components/ subfolder
F | Normalize POD component field naming | refactor(scene): normalize POD component field naming
G | Subdivide renderer/ into concept folders | refactor(render): subdivide renderer/ into concept folders
H | Extract LightTypes.h from RenderingSystem | refactor(render): extract LightTypes from RenderingSystem

Directory Changes

New folders

  • events/ — extracted from platform/
  • editor/widgets/ — from utils/ (Icons, ImGuiUtils)
  • scene/components/ — granular component headers
  • renderer/resources/ — Buffer, Mesh, Model, Texture
  • renderer/material/ — Material, MaterialSystem
  • renderer/shader/ — Shader, ShaderCompiler, ShaderLibrary
  • renderer/pipeline/ — PipelineManager
  • renderer/lighting/ — IBLPrecompute, LightTypes (new)
  • renderer/settings/ — GTAOSettings, PostProcessSettings
  • renderer/draw/ — DrawCommand

Removed folders

  • utils/ — dispersed

Renamed

  • scene/System.h → scene/systems/ISystem.h (class System → ISystem)
  • scene/Systems.{h,cpp} → scene/systems/SystemRegistry.{h,cpp} (class Systems → SystemRegistry)
  • utils/LuthIcons.h → editor/widgets/Icons.h
  • utils/ImGuiUtils.h → editor/widgets/ImGuiUtils.h
  • utils/CustomFormatters.h → core/LogFormatters.h
  • utils/ImageUtils.cpp → resources/ImageUtils.cpp
  • renderer/FrameData.h → core/FrameData.h

Moved

  • 25 files from renderer/ top level into concept subfolders (sub-task G)
  • 7 event files from platform/ to events/ (sub-task A)

Key Design Decisions

SystemRegistry ownership fix

Previous Systems::AddSystem<T>() emplaced a unique_ptr<T> into a vector<shared_ptr<System>> — implicit conversion masked an ownership-model bug. Fixed:

  • Storage: vector<unique_ptr<ISystem>> (manager is sole owner)
  • GetSystem<T>() returns non-owning T* (was shared_ptr<T>)
  • All 5 panel members updated from shared_ptr<RenderingSystem> to raw RenderingSystem*
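
The ownership model described above can be sketched as follows. This is an illustration under stated assumptions, not the engine's actual code: the `dynamic_cast` lookup, the `UpdateAll` helper, and the reduced `RenderingSystem` with a `frames` counter are all hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct ISystem {
    virtual ~ISystem() = default;
    virtual void Update(float dt) = 0;
};

class SystemRegistry {
public:
    // The registry takes sole ownership; callers get a non-owning pointer.
    template <typename T, typename... Args>
    T* AddSystem(Args&&... args) {
        auto system = std::make_unique<T>(std::forward<Args>(args)...);
        T* raw = system.get();
        m_Systems.push_back(std::move(system));
        return raw;
    }

    // Non-owning lookup; returns nullptr when no system of type T exists.
    // (Lookup via dynamic_cast is an assumption made for this sketch.)
    template <typename T>
    T* GetSystem() const {
        for (const auto& system : m_Systems)
            if (auto* typed = dynamic_cast<T*>(system.get()))
                return typed;
        return nullptr;
    }

    void UpdateAll(float dt) {
        assert(dt >= 0.0f);
        for (auto& system : m_Systems)
            system->Update(dt);
    }

private:
    std::vector<std::unique_ptr<ISystem>> m_Systems;  // sole owner
};

// Hypothetical reduced system used only to exercise the registry.
struct RenderingSystem : ISystem {
    int frames = 0;
    void Update(float) override { ++frames; }
};
```

A panel would then hold a plain `RenderingSystem*` obtained from `GetSystem<RenderingSystem>()`, valid for as long as the registry lives.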

POD component field naming

struct ID { UUID Value; } chosen over struct ID { UUID ID; } to avoid struct-name/member-name shadow collision. Value applied uniformly across ID, Tag, Parent, Children (newtype wrapper convention). 14 caller files updated via .m_X → .Value.

Components.h umbrella

Split into 6 granular headers (Common, Transform, Camera, Rendering, Lights, Animation) but kept Components.h as a 7-line umbrella #include. Existing #include "luth/scene/Components.h" callsites required no changes.

LightTypes.h extraction

DirectionalLightData, PointLightData, LightUniforms, k_ShadowCascadeCount, k_ShadowResolution moved from RenderingSystem.h (which includes Vulkan + scene + renderer headers) to a pure-data header with only core/LuthTypes.h + glm.hpp as dependencies. Shader reflection, tests, and future tools can include freely.


Lessons

Bulk text rewrites: use perl with BEGIN{binmode}, not sed.
sed -i on Git Bash for Windows silently strips CRLF → LF on every file it touches, even files with no match — produced 170+ bogus "modified" entries in git status during sub-task A. perl -i -pe 'BEGIN{binmode(ARGV);binmode(STDOUT);} s|...|...|g' reads/writes in binary mode and preserves CRLF byte-for-byte.

Ownership bugs hide behind implicit conversions. The unique_ptr → shared_ptr container mismatch in Systems compiled cleanly because vector::emplace_back accepts anything convertible to the element type. Worth grepping for "raw pointer returned from smart container" patterns as a class of future bugs.


Build Verification

  • Debug x64 builds clean after every sub-task (8 incremental builds)
  • Premake regeneration clean (scripts\setup\setup_windows.bat equivalent)
  • No new warnings
  • Luth.lib + Luthien.exe artifacts produced

v1.5.0 — gtao

17 Apr 21:46

v1.5.0 — GTAO (Ground Truth Ambient Occlusion)

Version: v1.5.0 | Date: 2026-04-17 | Epic: #58 | Deps: compute-gpu-culling (v1.2.0)


What Was Built

Screen-space ambient occlusion via Ground Truth AO (Jimenez et al. 2016, XeGTAO inspiration). Replaces Luth's flat ao = 1.0 default ambient term with a physically-grounded occlusion signal that modulates the split-sum IBL contribution, dramatically improving the grounding of objects in scenes dominated by indirect light. Compute-only, mip-0 only (no LDS mip chain), no temporal accumulation yet — an MVP that slots into the existing render graph + compute pass infrastructure shipped in compute-gpu-culling.

Pipeline (per frame, after shadows, before forward shading):

Stage | Input | Output | Notes
DepthPrepass (new, opaque-only forward) | Indirect draws | SceneDepth (D32) | Enables GTAO to read depth before PBR shades; GeometryPass now loads depth with LESS_EQUAL
GTAODepthPrefilter (compute) | SceneDepth | GTAOLinearDepth (R32F, half-res) | 2×2 min-gather + perspective linearize; sky pixels clamped to farZ
GTAOMain (compute) | GTAOLinearDepth | GTAORawAO (R8, half-res) | Horizon-based integral, 2–8 slices, IGN jitter; VS normals reconstructed from depth derivatives
GTAODenoise (compute) | GTAORawAO + GTAOLinearDepth | GTAOFinal (R8, half-res) | 3×3 tent + bilateral depth weight (~10% relative sigma)
GeometryPass (pbr.frag) | GTAOFinal (Set 0 binding 4) | SceneColor | ambient *= gtaoAO — multiplies material occlusion if present

  • Z-prepass. Depth-only forward pass using the camera region of the existing indirect buffer; position-only vertex shader (rigid + skinned variants), reuses shadowDepth.frag as null fragment. GeometryPass switched to LOAD_OP_LOAD + VK_COMPARE_OP_LESS_OR_EQUAL so opaques pass on equal-z. Unblocks both GTAO and the future forward-plus (#54) cluster pipeline.
  • GTAOSettings. Runtime-tunable struct nested in PostProcessSettings: enabled / halfRes / visualize, intensity / radius / falloff / power, sliceCount (2/3/4/8) / stepsPerSlice. Editor section in RenderPanel with XeGTAO-recommended defaults (radius 0.5 m, falloff 0.615, power 2.0, 3 slices × 2 steps). Mirrored to GPU via a 48-byte std140 GTAOUBO, refreshed each frame in UpdateGTAOUBO().
  • Set 0 expansion. Two new bindings sampled by pbr.frag: binding 4 = sampler2D gtaoTex, binding 5 = GTAOUBO. Descriptor writes live in UpdateAODescriptors (called from InitAOResources and after Resize recreates the half-res textures).
  • Frame Debugger support. GTAOLinearDepth / GTAORawAO / GTAOFinal registered as tracked render targets. Added R8_Unorm and R32_Float to both RG::TextureFormat and FrameDebugger::ToVkFormat so archive images allocate at native format instead of silently falling through to RGBA8_UNORM (which previously caused rainbow-banding previews for both GTAO buffers).
  • Visualize mode. gtao.visualize toggles the PBR shader to output the raw GTAO buffer as the scene color — isolates AO contribution for tuning without writing a dedicated debug pass.
  • Always-on chain. GTAO runs every frame regardless of enabled; the shader's enabled flag gates the modulation inside pbr.frag. This avoids first-frame layout-transition ordering issues (the Set 0 binding-4 sampler always sees a valid SHADER_READ_ONLY_OPTIMAL layout).
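
As an illustration of how the settings struct might be mirrored into the 48-byte UBO, here is a minimal sketch. The field order, the padding members, the `intensity` default, and the `MakeGTAOUBO` helper are assumptions made for this example; only the field names, the other defaults, and the 48-byte size come from the notes above.

```cpp
#include <cassert>
#include <cstdint>

// CPU-side settings. Field names follow the notes; layout is hypothetical.
struct GTAOSettings {
    bool enabled = true;
    bool halfRes = true;
    bool visualize = false;
    float intensity = 1.0f;          // assumed default, not stated in the notes
    float radius = 0.5f;             // metres
    float falloff = 0.615f;
    float power = 2.0f;
    std::uint32_t sliceCount = 3;    // one of 2, 3, 4, 8
    std::uint32_t stepsPerSlice = 2;
};

// GPU mirror. std140 stores a bool as a 4-byte value, so flags become
// uint32_t; three pad words bring the struct to the quoted 48 bytes.
struct GTAOUBO {
    float intensity, radius, falloff, power;
    std::uint32_t sliceCount, stepsPerSlice, enabled, halfRes;
    std::uint32_t visualize, pad0, pad1, pad2;
};
static_assert(sizeof(GTAOUBO) == 48, "GTAOUBO must match the 48-byte block");

// Hypothetical conversion helper, run once per frame before the upload.
inline GTAOUBO MakeGTAOUBO(const GTAOSettings& s) {
    assert(s.sliceCount == 2 || s.sliceCount == 3 ||
           s.sliceCount == 4 || s.sliceCount == 8);
    return GTAOUBO{s.intensity, s.radius, s.falloff, s.power,
                   s.sliceCount, s.stepsPerSlice,
                   s.enabled ? 1u : 0u, s.halfRes ? 1u : 0u,
                   s.visualize ? 1u : 0u, 0u, 0u, 0u};
}
```

The `static_assert` catches a CPU/GPU size mismatch at compile time, which is cheaper than diagnosing a misread uniform in the shader.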

Bugs Fixed Mid-Epic

  • Frame Debugger preview refresh required cascade-click round-trip. m_DepthPreviewKey (the Phase 14F depth-blit cache key) was never reset when a new capture began, so same-archive re-selections after a recapture skipped the blit and served a stale texture. Fixed by resetting it at BeginCapture, matching m_PerDrawPreviewKey's invalidation.
  • Archive sink format-reinterpretation. FrameDebugger::ToVkFormat is a parallel copy of RenderGraph::ToVkFormat and was missing cases for the new GTAO formats, so vkCmdCopyImage between the source image and the RGBA8 fallback destination did a raw byte reinterpretation — visible as colored horizontal banding over both GTAOLinearDepth (R32_SFLOAT) and GTAORawAO (R8_UNORM) previews. Fixed by adding the missing cases to both maps + to RG::TextureFormat.

Files Added / Modified

New:

  • luth/assets/shaders/depthPrepass.vert + depthPrepass_skinned.vert — position-only Z-prepass vertex shaders
  • luth/assets/shaders/gtao_depth_prefilter.comp — half-res min-gather + linearize
  • luth/assets/shaders/gtao_main.comp — horizon-based AO integral
  • luth/assets/shaders/gtao_denoise.comp — 3×3 bilateral-depth denoise
  • luth/source/luth/renderer/GTAOSettings.h — GTAOSettings + GTAOUBO (std140)
  • luth/source/luth/renderer/passes/DepthPrepass.cpp — camera-space Z-prepass (AddDepthPrepass)
  • luth/source/luth/renderer/passes/AOPass.cpp — AddGTAODepthPrefilterPass / AddGTAOMainPass / AddGTAODenoisePass
  • docs/development/epics/gtao.md (deleted at epic close — this file supersedes it)

Modified:

  • luth/assets/shaders/pbr.frag — Set 0 bindings 4/5; GTAO modulation in ambient term; viz early-out
  • luth/source/luth/scene/systems/RenderingSystem.{h,cpp} — Set 0 layout grows to 6 bindings; InitAOResources, UpdateAODescriptors, UpdateGTAOUBO; GTAO descriptor pool + sampler; frame-graph wiring (DepthPrepass → GTAO×3 → GeometryPass); tracked RTs; hot-reload rebuild of all three GTAO pipelines; m_DepthPreviewKey invalidation at capture start
  • luth/source/luth/renderer/passes/GeometryPass.cpp — receive SceneDepth handle, LOAD_OP_LOAD + LESS_EQUAL
  • luth/source/luth/renderer/{Texture.h,backend/vulkan/VulkanTexture.cpp} — R32_Float format
  • luth/source/luth/renderer/rendergraph/{RenderGraphResources.h,RenderGraph.cpp,RenderResourceCache.cpp} — R32_Float + R8_Unorm formats
  • luth/source/luth/renderer/FrameDebugger.cpp — archive format map gains R32_Float + R8_Unorm
  • luth/source/luth/renderer/PostProcessSettings.h — nested GTAOSettings gtao;
  • luth/source/luth/editor/panels/RenderPanel.cpp — "Ambient Occlusion (GTAO)" collapsing section
  • luth/source/luth/core/Version.h — bumped to v1.5.0

Out of Scope (Future Polish)

  • XeGTAO parity. Full 5-mip LDS depth pyramid; edges texture for anisotropic denoising; multi-bounce approximation for diffuse; selective specular attenuation.
  • Temporal accumulation. Reuses GTAO for free once the fxaa-taa epic (#72) lands motion vectors + history buffer.
  • Half-res / full-res toggle. UI field exists but has no effect yet — always half-res. Trivial to wire once a use case demands it.
  • PostProcessSettings serialization. GTAO settings reset to XeGTAO defaults per session, matching the existing bloom/tonemap fields. If editor persistence is wanted, extend EditorSettings to mirror the fields.
  • AO-aware specular. Currently multiplies diffuseIBL + specularIBL equally; XeGTAO weights specular with a separate cone-trace term derived from horizons.

v1.4.0 — Frame Debugger Sync Rework

17 Apr 12:45

Phase 14 — Frame Debugger Sync Rework

Version: v1.4.0 | Date: 2026-04-17 | Epic: #74 | Supersedes: #31


What Was Built

Reworked the Frame Debugger into a Unity-grade, GPU-true debugging tool. The old live-replay model — re-executing the pipeline up to N draws using the current uniforms/cull state, not the captured ones — was the root of a chronic sync bug where the displayed image never matched the selected step. Phase 14 deletes that path entirely and replaces it with archived per-pass images + on-demand per-draw replay.

  • Archive sink (IArchiveSink + ArchivedImage) — RG::RenderGraph::Execute invokes a sink hook after every non-culled pass; the FrameDebugger sink emits vkCmdCopyImage for each tracked render target into a fresh, persistent staging image, restoring the source layout so the RG's compile-time barrier solver stays consistent. Tracked RTs in v1: SceneColor, SceneDepth, ShadowMap.C0..C3 (one per cascade — ShadowPass imports per-layer views with names suffixed by cascade index), LDROutput, EntityID, BloomAFinal (~10 archives, ~50–100 MB at 1080p).
  • Frozen-state model — strict snapshot. The Frozen branch in RenderingSystem::Update does NOT rebuild or re-execute the live graph; m_LDROutput retains its captured contents and the editor's ScenePanel keeps showing the GPU-true frame. Each Frozen tick bit-compares the current viewProj against captureViewProj — a mismatch flips the state machine back to CaptureRequested and the next frame runs a fresh capture (Unity behavior: frozen on the captured image, auto-refresh on camera move).
  • Hierarchical EventNode tree — replaces the flat pass→draw list with Group / Pass / Cascade / Draw kinds. An explicit prefix registry (FrustumCull. → "Frustum Culling", ShadowPass.C<N> → "Shadows" with cascade children) keeps grouping deterministic. Built once at FinalizeCapture and stored on CapturedFrame::rootEvent.
  • Per-draw replay-then-copy — clicking a GeometryPass draw triggers an ImmediateSubmit that re-records the pass up to draw N into m_SceneColor and copies the result into a persistent RGBA16F preview the panel samples through ImGui. The live UBOs/SSBOs/indirect buffer are byte-stable in Frozen state (no live writers between captures once AnimationSystem is paused — see bug fix below), so no separate frozen-buffer plumbing is required. Cache is keyed by (passIdx, localDrawIdx) and invalidated on every BeginCapture / ExitCapture.
  • CSM cascade UI — Cascade nodes in the tree map to per-cascade single-layer depth archives. BlitArchivedDepthToPreview linearizes the selected cascade through the existing depth blit pipeline into an RGBA8 preview, using [prev_split..this_split] for sensible per-cascade contrast. Detail panel surfaces capture-time cascadeSplitsViewZ, shadowBias, shadowNormalBias, cascadeTexelSize, and the full lightSpaceMatrix[i] — values stamped from m_Cached* at finalize so editing light parameters while frozen doesn't desync the readout.
  • Lifetime safety — archive teardown deferred via VulkanContext::PushDeletion so in-flight ImGui frames sampling archive views can complete before the views/images are freed. Panel descriptor caches keyed by VkImageView pointer (not archive index) so recaptures with overlapping indices always trigger fresh ImGui_ImplVulkan_AddTexture calls. All archive frees route through VulkanAllocator::FreeImage to keep the editor's MemoryTracker GPU counter in sync with VMA.
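
The Frozen-state check described above can be sketched as a bitwise matrix comparison. This is a minimal illustration, assuming a plain 16-float `Mat4`; the `Live` enumerator and the `BitEqual` and `TickFrozen` names are hypothetical.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical state set; only CaptureRequested and Frozen are named in the notes.
enum class DebuggerState { Live, CaptureRequested, Frozen };

struct Mat4 {
    float m[16];
};

// Bitwise comparison: any change in the matrix bytes counts as camera movement.
inline bool BitEqual(const Mat4& a, const Mat4& b) {
    return std::memcmp(a.m, b.m, sizeof(a.m)) == 0;
}

// One Frozen tick: stay frozen while the camera is untouched, otherwise
// request a fresh capture for the next frame.
inline DebuggerState TickFrozen(DebuggerState state,
                                const Mat4& currentViewProj,
                                const Mat4& captureViewProj) {
    assert(state == DebuggerState::Frozen);
    return BitEqual(currentViewProj, captureViewProj)
               ? DebuggerState::Frozen
               : DebuggerState::CaptureRequested;
}
```

A byte compare has no tolerance, so even a tiny camera nudge triggers a recapture, which fits the strict-snapshot intent.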

Bugs Fixed Mid-Phase

  • Cache key collision across captures — same (passIdx, drawIdx) after a recapture meant the second click on the same draw was a cache hit and the panel served stale preview content. Fixed by invalidating m_PerDrawPreviewKey on every BeginCapture and ExitCapture.
  • MemoryTracker drift — deferred destroy lambda originally called vmaDestroyImage directly, bypassing VulkanAllocator::FreeImage (which is what fires MemoryTracker::RecordFree). VMA freed the GPU memory but the editor's GPU counter only ever saw allocations. Fixed by routing all archive + per-draw + depth-preview destruction through VulkanAllocator::FreeImage.
  • Click handler swallowed by right-aligned annotation — ImGui::IsItemClicked was called after the SameLine + TextDisabled annotation, so it queried the disabled label (unclickable) instead of the tree node. Fixed by capturing clickedThisNode immediately after TreeNodeEx.
  • Cascade output "no preview" — initial tracked-RT set registered "ShadowMap" but ShadowPass imports per-cascade resources named "ShadowMap.C<i>", so the sink filtered every cascade write out and cascade nodes had archivedImageIndex = -1. Fixed by registering all four ShadowMap.C0..C3 names. BlitArchivedDepthToPreview also needed a fallback to archive.view when archive.layers <= 1 because each cascade archive is single-layer (the source is a per-layer view onto the shared 4-layer image), but the EventNode's archiveLayer carries the cascade index 0..3 for detail-panel lookups.
  • Animated meshes drifting between draw replays — AnimationSystem::Update ticked every frame regardless of debugger state, so BoneMatrixBuffer's contents changed between consecutive per-draw replays and each draw rendered a different pose. Fixed by early-returning from AnimationSystem::Update when RenderingSystem::GetDebuggerState() == Frozen. Mirrors Unity's pause-while-inspecting behavior; will fold into a scene-level pause flag once Phase 16 (physics) and Phase 15 (play mode) land.
  • Timings vanish when scene gains its first model — m_GPUTimers.Init(16) was below the live frame's non-culled pass count. With no models, ShadowPass.C0..C3 are dead-pass-culled, total ≤16. Adding one model un-culls them, total >16, and GPUTimerPool::ReadResults early-returns -1 for every slot when passCount > maxPasses. Bumped capacity to 64 (current frame ≈19 passes; headroom for GTAO etc.).

Files Modified / Added

New:

  • luth/source/luth/renderer/rendergraph/ArchivedImage.{h,cpp} — staging-image RAII + lazy per-layer view cache
  • luth/source/luth/renderer/rendergraph/IArchiveSink.h — RG post-pass hook interface
  • luth/source/luth/renderer/rendergraph/FrameEventTree.{h,cpp} — hierarchical event model

Modified:

  • luth/source/luth/renderer/rendergraph/RenderGraph.{h,cpp} — SetArchiveSink + post-pass invocation
  • luth/source/luth/renderer/rendergraph/FrameCapture.h — archivedImages, passArchives, captureViewProj, rootEvent, cascade cache (splits/bias/texel/light-space matrices)
  • luth/source/luth/renderer/FrameDebugger.{h,cpp} — IArchiveSink impl, archive lifecycle, deferred teardown
  • luth/source/luth/scene/systems/RenderingSystem.{h,cpp} — gut RenderCapturedFrame (~350 LoC), rewrite Frozen branch, FinalizeCapture, ReplayPassUpToDraw, BlitArchivedDepthToPreview, EnsurePerDrawPreviewTexture, EnsureDepthPreviewTexture, GPU timer pool bumped to 64
  • luth/source/luth/scene/systems/AnimationSystem.cpp — early-return when Frame Debugger is Frozen
  • luth/source/luth/editor/panels/FrameDebuggerPanel.{h,cpp} — recursive DrawEventNode, archive / per-draw / depth preview paths, cascade detail block
  • luth/source/luth/core/Version.h — bumped to v1.4.0

Out of Scope (Future Polish)

  • Per-draw replay for non-GeometryPass passes (ShadowPass.C<i>, fullscreen passes). Today the panel falls back to the pass-output archive for those.
  • 3D viewport overlay of cascade frustum slices.
  • 4-thumbnail strip for A/B cascade comparison (cascade detail panel currently shows one cascade at a time).
  • HDR tonemapping for the per-draw preview (raw RGBA16F surfaces, clipping above 1.0 is annotated in the panel).
  • Scene-level pause flag — replace the AnimationSystem → RenderingSystem direct query with a single Scene::IsPaused() flag once PhysicsSystem / PlayMode land (Phases 15–16).

v1.3.0 — Cascaded Shadow Maps

16 Apr 22:02

Phase 13 — Cascaded Shadow Maps

Version: v1.3.0 | Date: 2026-04-16 | Epic: #60


What Was Built

Replaced the single 2048² directional-light shadow map with a 4-cascade PSSM system:

  • 4-layer shadow array — VK_IMAGE_VIEW_TYPE_2D_ARRAY D32 texture (2048×2048×4); per-layer VkImageView for ShadowPass writes; full-array view for PBR sampling.
  • CSM uniform plumbing — GlobalUniforms extended with lightSpaceMatrix[4], cascadeSplitsViewZ, shadowBias, shadowNormalBias, cascadeTexelSize, cascadeBlendWidth, debugVisualizeCascades (std140, 544 B).
  • PSSM splits — Engel practical formula (splitLambda = 0.5 default) converted to view-space Z.
  • Per-cascade ortho fitting — Sascha Willems bounding-sphere approach: centroid of 8 sub-frustum corners, radius = ceil(r * 16) / 16, glm::lookAt + symmetric glm::ortho(-r, r, -r, r, 0, 2r). Rotation-invariant and shimmer-resistant.
  • ShadowPass multi-layer — 4 × AddShadowPass calls, each with per-layer view and cascadeIndex push constant.
  • Per-cascade GPU culling — Indirect buffer region per cascade; 5 cull dispatches (camera + 4 cascades); frustum planes extracted via Gribb-Hartmann from each lightSpaceMatrix[i].
  • PBR cascade selection — viewZ → primary cascade; inside-test fall-through loop for robustness; 3×3 PCF via sampler2DArrayShadow; cascade blend at transition zone; per-cascade depth + normal bias scaled by cascadeTexelSize.
  • Debug viz — DebugVisualizeCascades flag tints fragments by cascade index (red/green/blue/yellow).
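
The split computation and the radius snapping above can be sketched as follows. This is a minimal sketch of the practical split scheme (a lambda-weighted blend of logarithmic and uniform splits), not the engine's code. `ComputeCascadeSplits`, `SnapRadius`, and `kCascadeCount` are hypothetical names, and the splits are returned as positive view distances, leaving the view-space Z sign convention to the caller.

```cpp
#include <array>
#include <cassert>
#include <cmath>

constexpr int kCascadeCount = 4;

// Far distance of each cascade, blending logarithmic and uniform splits.
// lambda = 1 gives purely logarithmic splits, lambda = 0 purely uniform ones.
inline std::array<float, kCascadeCount> ComputeCascadeSplits(float nearZ,
                                                             float farZ,
                                                             float lambda = 0.5f) {
    assert(nearZ > 0.0f && farZ > nearZ);
    std::array<float, kCascadeCount> splits{};
    for (int i = 0; i < kCascadeCount; ++i) {
        const float p = static_cast<float>(i + 1) / static_cast<float>(kCascadeCount);
        const float logSplit = nearZ * std::pow(farZ / nearZ, p);
        const float uniformSplit = nearZ + (farZ - nearZ) * p;
        splits[i] = lambda * logSplit + (1.0f - lambda) * uniformSplit;
    }
    return splits;
}

// Quantize the bounding-sphere radius upward to 1/16 steps, so small frustum
// changes do not alter the ortho extents from frame to frame.
inline float SnapRadius(float r) {
    return std::ceil(r * 16.0f) / 16.0f;
}
```

The snapped radius `r` would then feed the symmetric ortho projection `(-r, r, -r, r, 0, 2r)` quoted in the list above.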

Known Issue — Coverage Gaps

A light-direction-dependent coverage bug persists: large ground-plane regions fail the ProjectInCascade inside-test (specifically proj.z < 0), appearing unlit. The bug is not in the cascade-fit math — Sascha's verbatim reference implementation also produces the symptom. Cascade-tint visualization confirms the fit geometry is correct; the failure is in shadow-map sampling.

Prioritized suspects for next session:

  1. UBO round-trip integrity — verify shader receives the correct lightSpaceMatrix values (sentinel matrix test).
  2. Shadow pass cullMode = VK_CULL_MODE_FRONT_BIT — Sascha's reference uses back-face cull; mismatch may corrupt depth writes.
  3. Gribb-Hartmann near plane uses the GL clip convention (row3 + row2) rather than the Vulkan zero-to-one convention (row2 alone, with 0-based rows); this only makes culling too permissive, but it is worth fixing.
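
For the near-plane suspect, the difference between the two clip conventions can be sketched as below. It assumes a row-major `Mat4` applied as clip = M · (x, y, z, 1); the type aliases and function names are hypothetical. With a [-1, 1] depth range the near plane is the w row plus the z row, while with a [0, 1] depth range it is the z row alone.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<float, 4>;
// Row-major: m[r] is row r of the matrix, and clip = M * (x, y, z, 1).
using Mat4 = std::array<Vec4, 4>;

enum class ClipDepth { NegativeOneToOne, ZeroToOne };

// Unnormalized near plane; only the sign of the distance is used below.
inline Vec4 NearPlane(const Mat4& m, ClipDepth depth) {
    Vec4 plane = m[2];                      // z row: z_clip >= 0
    if (depth == ClipDepth::NegativeOneToOne)
        for (int i = 0; i < 4; ++i)
            plane[i] += m[3][i];            // w row + z row: z_clip >= -w_clip
    return plane;
}

// Positive or zero means the point is on the visible side of the plane.
inline float SignedDistance(const Vec4& plane, float x, float y, float z) {
    return plane[0] * x + plane[1] * y + plane[2] * z + plane[3];
}
```

With an identity matrix, a point at z = -0.5 passes the [-1, 1] near plane but fails the [0, 1] one, which is the "too permissive" behaviour: the GL form keeps objects the Vulkan form would cull.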

Recommended tooling: Frame Debugger needs to be updated before tackling this (it was unavailable during Phase 13 debugging and would have resolved the issue quickly).

Files Modified

  • luth/source/luth/scene/systems/RenderingSystem.{h,cpp} — GlobalUniforms, shadow resources, cascade math, per-cascade cull dispatches
  • luth/source/luth/renderer/passes/ShadowPass.cpp — per-layer view, cascadeIndex push constant
  • luth/source/luth/renderer/passes/CullPass.{h,cpp} — destOffset arg, 5 named dispatches
  • luth/source/luth/renderer/passes/GeometryPass.cpp — all-layer barrier before geometry
  • luth/source/luth/renderer/backend/vulkan/VulkanTexture.{h,cpp} — CreateLayerView
  • luth/source/luth/scene/Components.h — DirectionalLight CSM fields
  • luth/source/luth/scene/SceneSerializer.cpp — persist CSM fields
  • luth/source/luth/editor/panels/InspectorPanel.cpp — basic CSM inspector controls
  • luth/assets/shaders/pbr.frag — cascade selection, blending, PCF, debug tint
  • luth/assets/shaders/shadowDepth.vert — cascadeIndex push constant
  • luth/assets/shaders/shadowDepth_skinned.vert — same
  • luth/assets/shaders/gpu_cull.comp — destOffset push constant
  • luth/source/luth/core/Version.h — bumped to v1.3.0

v1.2.0 — Compute + GPU Culling

15 Apr 18:26

Highlights

GPU-driven rendering infrastructure: compute pass support in the render graph, GPU frustum culling via compute shader + indirect draw, and migration of every graphics pass off vkCmdDrawIndexed to vkCmdDrawIndexedIndirect. This is the keystone phase that unlocks GTAO, Forward+, HZB occlusion, and GPU particles in later phases. (Epic #55)

What's Included

  • Render graph compute + buffer infrastructure — BufferDesc/BufferHandle/BufferBarrier, 5 new ResourceState values (ComputeRead/Write, StorageBufferRead/Write, IndirectRead), AddComputePass<Data>(), VkBufferMemoryBarrier2 emission, VMA-backed pooled storage buffers.
  • VKComputePipeline wrapper — clean abstraction over vkCreateComputePipelines with pipeline layout + push constants, used for both GPU cull and IBL precompute.
  • GPU frustum cull compute pass — gpu_cull.comp (256 invocations/group) tests bounding spheres against 6 frustum planes; sets instanceCount=0 on culled indirect commands (no GPU-side compaction — hardware skips zero-count draws).
  • GPUObjectData SSBO (Set 5) — 112-byte std430 struct per object. All vertex shaders read via objects[gl_BaseInstance]; push constants removed from geometry, shadow, PBR, and skinned variants.
  • Indirect draw conversion — GeometryPass and ShadowPass both use vkCmdDrawIndexedIndirect, grouped by (VB, IB, pipeline). Shadow reuses the main-camera cull results (per-cascade culling deferred to Phase 13).
  • IBLPrecompute refactor — ad-hoc vkCreateComputePipelines replaced with persistent VKComputePipeline instances.
  • Frame Debugger extensions — DispatchKind enum, CaptureIndirectDraw(), CaptureComputeDispatch(), [C]/[I] prefixes in the panel tree, compute/indirect metadata in detail view.
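
The culling logic in gpu_cull.comp can be illustrated with a CPU-side reference in C++. This is a sketch under stated assumptions, not a port of the shader: `Plane`, `Sphere`, `SphereVisible`, and `CullDraws` are hypothetical names, planes are assumed normalized with inward-facing normals, and `IndirectCommand` only mirrors the field layout of `VkDrawIndexedIndirectCommand`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Plane  { float nx, ny, nz, d; };      // normalized, normal points inward
struct Sphere { float x, y, z, radius; };

// Same field order as VkDrawIndexedIndirectCommand.
struct IndirectCommand {
    std::uint32_t indexCount;
    std::uint32_t instanceCount;
    std::uint32_t firstIndex;
    std::int32_t  vertexOffset;
    std::uint32_t firstInstance;
};

// A sphere is culled as soon as it lies fully behind any one plane.
inline bool SphereVisible(const std::array<Plane, 6>& frustum, const Sphere& s) {
    for (const Plane& p : frustum) {
        const float distance = p.nx * s.x + p.ny * s.y + p.nz * s.z + p.d;
        if (distance < -s.radius)
            return false;
    }
    return true;
}

// No compaction: culled draws stay in the buffer with instanceCount = 0.
inline void CullDraws(const std::array<Plane, 6>& frustum,
                      const std::vector<Sphere>& bounds,
                      std::vector<IndirectCommand>& commands) {
    assert(bounds.size() == commands.size());
    for (std::size_t i = 0; i < commands.size(); ++i)
        commands[i].instanceCount = SphereVisible(frustum, bounds[i]) ? 1u : 0u;
}
```

On the GPU each invocation would handle one object index instead of the loop in `CullDraws`; keeping the command in place is what lets the draw call use a fixed command count.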

Descriptor Sets (final after Phase 12)

Set | Content
0 | GlobalUniforms + shadowMap + IBL
1 | Bindless textures
2 | Material SSBO
3 | Light UBO
4 | Bone matrices SSBO
5 | GPUObjectData SSBO (new)

Full Changelog

v1.1.1...v1.2.0