Releases: Hekbas/Luth
v2.3.0 — core-reorg
Reorganized luth/source/luth/core/ from a flat 20-file folder into three semantic sub-folders. LuthTypes.h split by purpose; Math.h renamed and merged with the GLM aliases. Pure file-organization refactor — zero runtime change.
Layout
luth/source/luth/core/
├── App, EntryPoint, Version, FrameData, UUID, # top-level lifecycle
│ EditorHooks, ProjectFile # (stays at root)
├── types/
│ ├── LuthTypes.h primitives only (i8…u64, f32/f64, byte, fs)
│ ├── LuthMath.h Vec/Mat/Quat aliases + Math:: facade + Assimp/AABB/Frustum
│ └── TypeTraits.h IsGLMVector / IsGLMMatrix
├── diagnostics/
│ ├── Log.{h,cpp}
│ ├── LogFormatters.h fmt + ostream<< Vec3/Mat4 (moved from LuthTypes.h)
│ └── Profiler.h
└── time/
├── Time.h
└── Timer.h
Migration
- 132 caller files rewritten via perl -pi. LuthTypes.h consumers split by Vec/Mat/Quat usage: 18 → LuthMath.h, 48 → LuthTypes.h (primitives-only).
- ostream<< Vec3/Mat4 operators relocated from LuthTypes.h to LogFormatters.h (logging concern).
- Dead Luth::Normalize/Cross forward decls deleted (no definitions, no callers; superseded by Math::* in v2.2.0).
Both Debug + Release x64 build clean.
Issue: #81
Commits: 3 on epic/core-reorg
v2.2.0 — math-abstraction
Date: 2026-04-18
Commits: 8 (on epic/math-abstraction)
Issue: #80
Overview
Second epic of the post-v2.0 architecture-review series. Built Luth::Math as a single-source facade for every glm type and function the engine uses. After this epic, only LuthTypes.h and Math.h include <glm/...>; engine and editor code use Vec*/Mat*/Quat aliases and Math::* wrappers exclusively.
Minor version bump to v2.2.0 per the ROADMAP MINOR rule (one completed epic with engineering-visible changes — single math include, modern C++20 templated constants, type-aware tolerances, latent numeric_limits::min vs lowest confusion fixed in AABB).
Sub-Tasks
| # | Sub-task | Commit |
|---|---|---|
| — | Spec scaffold | 8ed0300 docs(epic): add math-abstraction spec |
| A | Reshape LuthTypes.h primitive layer | eaeaa00 refactor(core): reshape LuthTypes.h primitive layer |
| B | Build Luth::Math facade | 7d91417 feat(core): build Luth::Math facade with constants and function wrappers |
| C | Pass A engine — types | 9c6f2b9 refactor(luth): migrate engine to Vec*/Mat* aliases (Pass A) |
| D | Pass A editor — types | 430beca refactor(luthien): migrate editor to Vec*/Mat* aliases (Pass A) |
| E | Pass B engine — functions | fc55bd8 refactor(luth): route engine math through Luth::Math facade (Pass B) |
| F | Pass B editor — functions | c9188f3 refactor(luthien): route editor math through Luth::Math facade (Pass B) |
| G | Purge direct glm includes | 1bef56b chore(core): purge direct glm includes engine-wide |
| H | Docs + v2.2.0 + history + wrap-up | chore(release): math-abstraction → v2.2.0 |
Type Aliases (LuthTypes.h)
Before
using Vec2 = glm::vec2; using Vec3 = glm::vec3; using Vec4 = glm::vec4;
using Mat3 = glm::mat3; using Mat4 = glm::mat4;
using Quat = glm::quat;
constexpr f32 PI = 3.14159265358979323846f;
constexpr f32 TWO_PI = 2.0f * PI;
constexpr f32 HALF_PI = 0.5f * PI;
constexpr f32 EPSILON = std::numeric_limits<f32>::epsilon();
constexpr f32 FLOAT_MAX = std::numeric_limits<f32>::max();
constexpr f32 FLOAT_MIN = std::numeric_limits<f32>::min(); // smallest *positive*, almost never what callers want
After
using Vec2 = glm::vec2; using Vec3 = glm::vec3; using Vec4 = glm::vec4;
using IVec2 = glm::ivec2; using IVec3 = glm::ivec3; using IVec4 = glm::ivec4;
using UVec2 = glm::uvec2; using UVec3 = glm::uvec3; using UVec4 = glm::uvec4;
using Mat2 = glm::mat2; using Mat3 = glm::mat3; using Mat4 = glm::mat4;
using Quat = glm::quat;
// Constants block deleted; moved to Luth::Math (templated)
The 6 root-namespace constants had zero call sites when verified, so they were dead code. They are reborn under Math:: as templated inline constexpr variables. EPSILON = numeric_limits::epsilon() was a particularly misleading name (machine epsilon ≈ 1.19e-7 vs. a spatial tolerance of ~1e-4). It is replaced by named tolerances (SmallNumber, KindaSmallNumber) and an explicit MachineEpsilon for the rare ULP-level case.
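The gap between machine epsilon and a spatial tolerance can be sketched as follows. This is a minimal illustration, not the facade itself: the `Sketch` namespace and the `NearlyEqual` helper are hypothetical, and only the two constant names and their values are taken from the notes above.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

namespace Sketch {  // hypothetical stand-in for Luth::Math

// Names and values as listed in the release notes.
template <class T> inline constexpr T KindaSmallNumber = T(1e-4);
template <class T> inline constexpr T MachineEpsilon = std::numeric_limits<T>::epsilon();

// Hypothetical helper: absolute-tolerance comparison.
template <class T>
bool NearlyEqual(T a, T b, T tolerance = KindaSmallNumber<T>) {
    return std::abs(a - b) <= tolerance;
}

// Two adjacent floats near 1000 differ by one ULP (about 6.1e-5).
// They pass the spatial tolerance but fail an absolute machine-epsilon test.
inline bool AdjacentFloatsDemo() {
    const float a = 1000.0f;
    const float b = std::nextafter(a, 2000.0f);
    const bool spatial = NearlyEqual(a, b);
    const bool ulpLevel = NearlyEqual(a, b, MachineEpsilon<float>);
    return spatial && !ulpLevel;
}

}  // namespace Sketch
```

Machine epsilon describes float spacing near 1.0, so using it as an absolute tolerance rejects even neighbouring values once coordinates grow to scene scale.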
Math Facade (Math.h)
Constants
namespace Luth::Math {
// Standard — delegate to <numbers>
template<class T> inline constexpr T Pi = std::numbers::pi_v<T>;
template<class T> inline constexpr T E = std::numbers::e_v<T>;
template<class T> inline constexpr T Sqrt2 = std::numbers::sqrt2_v<T>;
// Derived
template<class T> inline constexpr T TwoPi = T(2) * Pi<T>;
template<class T> inline constexpr T HalfPi = Pi<T> / T(2);
template<class T> inline constexpr T QuarterPi = Pi<T> / T(4);
template<class T> inline constexpr T InvPi = T(1) / Pi<T>;
template<class T> inline constexpr T DegToRad = Pi<T> / T(180);
template<class T> inline constexpr T RadToDeg = T(180) / Pi<T>;
// Tolerances (engine value-add)
template<class T> inline constexpr T SmallNumber = T(1e-8);
template<class T> inline constexpr T KindaSmallNumber = T(1e-4);
// Sentinels — delegate to <limits>
template<class T> inline constexpr T FloatMax = std::numeric_limits<T>::max();
template<class T> inline constexpr T FloatLowest = std::numeric_limits<T>::lowest();
template<class T> inline constexpr T FloatMin = std::numeric_limits<T>::min();
template<class T> inline constexpr T MachineEpsilon = std::numeric_limits<T>::epsilon();
}
Function wrappers
25 wrappers covering every glm::* function the engine uses:
| Group | Functions |
|---|---|
| Transformation | Translate, Rotate, Scale, Perspective, Ortho (×2), LookAt |
| Linear algebra | Inverse, Transpose, Normalize, Length, Length2, Dot, Cross |
| Interpolation | Mix, Slerp |
| Quaternion | ToMat4 (covers both glm::toMat4 and glm::mat4_cast), EulerAngles, QuatLookAt, Decompose |
| Angle conversion | Radians, Degrees |
| Common math | Clamp, Min, Max, Abs (function templates — work on scalar + vector) |
| Pointer helpers | ValuePtr (const + non-const), MakeVec3 |
| Type re-exports | length_t, qualifier |
AABB sentinel fix
// Before
struct AABB {
Vec3 Min = Vec3( std::numeric_limits<float>::max());
Vec3 Max = Vec3(-std::numeric_limits<float>::max()); // works but not by name
...
};
// After
struct AABB {
Vec3 Min = Vec3(Math::FloatMax<f32>);
Vec3 Max = Vec3(Math::FloatLowest<f32>); // explicit lowest()
void Expand(const Vec3& p) { Min = Math::Min(Min, p); Max = Math::Max(Max, p); }
...
};
Both old and new code work, but the new spelling makes the intent (lowest(), not min()) impossible to misread.
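The min()/lowest() distinction can be checked with a self-contained sketch. The `Vec3` struct below is a hypothetical stand-in for the glm-backed alias, and `std::min`/`std::max` replace the `Math::Min`/`Math::Max` wrappers so the example compiles without glm.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>

namespace Sketch {

struct Vec3 { float x, y, z; };  // stand-in for the glm-backed Vec3 alias

constexpr float kMax = std::numeric_limits<float>::max();
constexpr float kLowest = std::numeric_limits<float>::lowest();

// min() is the smallest positive normal float, so it cannot serve as a
// "most negative" sentinel; lowest() is the value that equals -max().
static_assert(std::numeric_limits<float>::min() > 0.0f, "min() is positive");
static_assert(kLowest == -kMax, "lowest() is the most negative finite float");

struct AABB {
    Vec3 Min{kMax, kMax, kMax};          // empty box: any point shrinks Min
    Vec3 Max{kLowest, kLowest, kLowest}; // and grows Max
    void Expand(const Vec3& p) {
        Min = {std::min(Min.x, p.x), std::min(Min.y, p.y), std::min(Min.z, p.z)};
        Max = {std::max(Max.x, p.x), std::max(Max.y, p.y), std::max(Max.z, p.z)};
    }
};

}  // namespace Sketch
```

Had Max been initialised from min(), a box containing only negative coordinates would keep a small positive Max, which is the misreading the explicit name guards against.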
Migration
Pass A — types (commits C, D)
glm::vec2/3/4, glm::ivec2/3/4, glm::uvec2/3/4, glm::mat2/3/4, glm::quat → Vec*/IVec*/UVec*/Mat*/Quat aliases. Word-boundaried perl in-place across 35 files (28 engine + 7 editor); LuthTypes.h excluded so the IsGLMVector<glm::vec<L,T,Q>> and IsGLMMatrix<glm::mat<C,R,T,Q>> trait specializations stay valid (their job is to detect glm types).
Pass B — functions (commits E, F)
glm::translate/rotate/scale/... → Math::Translate/Rotate/Scale/.... Same files, same tool. Special cases:
- glm::two_pi<f32>() → Math::TwoPi<f32> (variable template, no parens)
- glm::mat4_cast(quat) and glm::toMat4(quat) → unified as Math::ToMat4
- glm::length_t, glm::qualifier → Math::length_t, Math::qualifier (template-parameter usages in EditorCamera)
- glm::length2 (unscoped from initial inventory) → Math::Length2: surfaced during commit E, facade extended in same commit
- glm::decompose (full TRS+skew+perspective unpack used by FrameDebuggerPanel) → Math::Decompose: facade extended in commit E for the editor consumer
Purge — includes (commit G)
#include <glm/...> removed from all 33 non-facade files. Math.h and LuthTypes.h are now the only files including glm headers. PCH carries the facade into every translation unit transparently — no consumer needs an explicit Math.h include.
LuthTypes.h operator<< formatter signatures cleaned up: const glm::vec3& → const Vec3& (typedef-equivalent, just clearer style).
Final Tally
| Metric | Before | After | Delta |
|---|---|---|---|
| Files with glm:: references | 37 | 2 (LuthTypes.h, Math.h: the facade) | −35 |
| Total glm:: references | ~450 | 50 (all inside the facade) | −400 |
| Files with <glm/...> includes | 35 | 2 (the facade) | −33 |
| Total <glm/...> include lines | ~38 | 10 (facade only) | −28 |
Key Design Decisions
Single math file vs split
Aliases stay in LuthTypes.h; functions and constants live in Math.h. PCH (luthpch.h) already pulls both, so callers get the full math layer transparently. The two-file split is conceptual (primitives vs operations) and matches the eventual E4 reorg where both move to core/types/. Doing it this way keeps E3 a clean enforcement pass — no folder moves, no file renames; E4 will move both files in one shot.
Templated inline constexpr over function form
Variable templates follow the C++20 standard library's own idiom (std::numbers::pi_v<T> is a variable template). They are as expressive as the function form (glm::pi<T>()) without the trailing-parens noise. Math::Pi<f32> reads cleanly at the call site and, being constexpr, is usable in constant expressions.
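A side-by-side sketch of the two spellings follows. Everything in the `Sketch` namespace is hypothetical; a literal stands in for `std::numbers::pi_v<T>` so the example also builds as C++17, and `PiFn` and `ToRadians` exist only for illustration.

```cpp
#include <cassert>
#include <cmath>

namespace Sketch {  // hypothetical stand-in, not the real Math.h

// Variable-template form (the style the facade adopts).
template <class T> inline constexpr T Pi = T(3.14159265358979323846L);
template <class T> inline constexpr T DegToRad = Pi<T> / T(180);

// Function form, in the style of glm::pi<T>().
template <class T> constexpr T PiFn() { return T(3.14159265358979323846L); }

// Both spellings are usable in constant expressions.
static_assert(Pi<float> == PiFn<float>(), "same value, different spelling");
static_assert(Pi<double> > 3.14 && Pi<double> < 3.15, "evaluated at compile time");

// Call-site comparison: no trailing parentheses on the variable template.
constexpr float ToRadians(float degrees) { return degrees * DegToRad<float>; }

}  // namespace Sketch
```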
Engine value-add: named tolerances
Math::SmallNumber<T> (1e-8) and Math::KindaSmallNumber<T> (1e-4) borrow Unreal's distinction. Spatial code rarely wants numeric_limits::epsilon() (~1.19e-7); it wants a robust tolerance for "essentially zero" or "vectors are equal enough." Naming the two cases makes intent explicit and lets future code change the values without a hunt-and-replace.
Header-only facade
All wrappers are inline one-line forwards to glm::*. No Math.cpp for E3. In optimized builds the compiler is expected to inline them away, so the wrappers should add no call overhead. E4 may revisit if folder relocation warrants .cpp files for non-inline helpers (ComposeTransform, Frustum).
Bulk-rewrite split per source tree
Pass A and Pass B each split into separate luth/luthien commits (rather than one combined Pass A or Pass B). Lets git bisect isolate which side of the engine→editor boundary introduced any regression. Adds one extra build per pass — cheap insurance.
Trait specializations stay glm-typed
IsGLMVector<glm::vec<L,T,Q>> and IsGLMMatrix<glm::mat<C,R,T,Q>> in LuthTypes.h keep glm::vec/glm::mat because detecting glm types is their purpose. Renaming the specialization body would break the abstraction at its foundation. Pass A explicitly excluded LuthTypes.h.
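Why the specializations must name the underlying templates can be shown with a small sketch. The `fake_glm` namespace is a hypothetical, simplified stand-in for glm (the real `glm::vec` also carries a qualifier parameter); the trait names are the ones from the notes.

```cpp
#include <cassert>
#include <type_traits>

namespace fake_glm {  // hypothetical stand-in for glm::vec / glm::mat
template <int L, class T> struct vec {};
template <int C, int R, class T> struct mat {};
}  // namespace fake_glm

namespace Sketch {

// Primary templates: anything that is not a glm type is rejected.
template <class T> struct IsGLMVector : std::false_type {};
template <class T> struct IsGLMMatrix : std::false_type {};

// The partial specializations pattern-match on the library templates
// themselves, so they cannot be written in terms of the aliases.
template <int L, class T>
struct IsGLMVector<fake_glm::vec<L, T>> : std::true_type {};
template <int C, int R, class T>
struct IsGLMMatrix<fake_glm::mat<C, R, T>> : std::true_type {};

// Alias layer: the traits still see through the aliases.
using Vec3 = fake_glm::vec<3, float>;
using Mat4 = fake_glm::mat<4, 4, float>;

static_assert(IsGLMVector<Vec3>::value, "alias resolves to a vec");
static_assert(IsGLMMatrix<Mat4>::value, "alias resolves to a mat");
static_assert(!IsGLMVector<float>::value, "scalars are not vectors");

}  // namespace Sketch
```

An alias such as Vec3 names one concrete instantiation, whereas the specialization has to match every length and element type, which is why the template pattern stays glm-typed.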
Build Verification
- 8 commits on `epic/math-...
v2.1.0 — shader-asset-pipeline
Date: 2026-04-18
Commits: 5 (on epic/shader-asset-pipeline)
Issue: #79
Overview
First epic of the post-v2.0 architecture-review series. Rewrote the shader asset pipeline around single-stage shader assets: each .vert, .frag, or .comp file on disk is one asset with one UUID and one SPIR-V artifact. No more .vert+.frag pairing assumption in the importer, no more runtime ShaderCompiler::Compile fallback in the renderer, no more Fragment shader not found errors on startup.
Minor version bump to v2.1.0 per the ROADMAP MINOR rule (one completed epic with user-visible changes: the startup error is gone, and second launches are faster thanks to cached SPIR-V for all 24 engine shaders).
Sub-Tasks
| # | Sub-task | Commit |
|---|---|---|
| ABC | Single-stage shader asset model (schema + importer + Shader class) | 51d796a refactor(assets): single-stage shader assets (schema, importer, Shader class) |
| D | RenderPipeline + IBLPrecompute migration through ShaderLibrary::LoadEngine | 77f2c2f refactor(renderer): load all engine shaders through asset pipeline |
| E | Hot-reload for any stage; remove dead RecompileUtilityShaders fallback | de34495 refactor(renderer): hot-reload any shader stage, remove utility-recompile fallback |
| - | Register .meta files for compute shaders (generated on first run) | d7ab811 chore(assets): register .meta files for compute shaders |
| F | Docs + v2.1.0 + history + wrap-up | chore(release): shader-asset-pipeline → v2.1.0 |
(ABC bundled because the schema / importer / Shader-class reshape cannot be split build-clean — see "Sub-task granularity" in the epic spec.)
Schema Change
Before (v1 shader artifact)
struct ShaderAssetData { std::vector<u32> VertexSpirV; std::vector<u32> FragmentSpirV; std::string SourcePath; };
struct ShaderHeader { u32 VertexSpirVSize; u32 FragmentSpirVSize; };
// Artifact: [AssetHeader(v=1)][ShaderHeader{VertexSize,FragmentSize}][VertSpirV][FragSpirV]
ShaderImporter::Import(.vert) compiled both the .vert and its paired .frag. .frag source files had no artifact of their own (importer returned true without writing). Vertex-only / fragment-only / compute shaders could not be assets; they were runtime-compiled inline in RenderPipeline::Initialize / RecompileUtilityShaders / InitCullPipeline / InitAOResources / InitDebugBlitResources / IBLPrecompute (≈45 call sites, no caching, every launch).
After (v2 shader artifact)
enum class ShaderStage : u32 { Unknown, Vertex, Fragment, Compute };
struct ShaderAssetData { ShaderStage Stage; std::vector<u32> SpirV; std::string SourcePath; };
struct ShaderHeader { u32 Stage; u32 SpirVSize; };
// Artifact: [AssetHeader(v=2)][ShaderHeader{Stage,SpirVSize}][SpirV]
DeserializeShader rejects version != 2, so V1 artifacts force a re-import under the new schema on first launch after this lands.
Shader / VulkanShader Contract
// Shader (was multi-stage container; now single-stage)
class Shader : public Asset {
virtual ShaderStage GetStage() const = 0;
virtual const std::vector<u32>& GetSpirV() const = 0;
virtual const fs::path& GetPath() const = 0;
virtual bool IsValid() const = 0;
virtual void Reload() = 0;
static std::shared_ptr<Shader> Create(ShaderStage, const std::vector<u32>& spirv, const fs::path&);
};
// VulkanShader stores one VkShaderModule + one VkPipelineShaderStageCreateInfo.
// Removed: CompileOrGetVulkanBinaries pairing heuristics.
// Removed: old Create(vertSpv, fragSpv, path) / VulkanShader(vertSpv, fragSpv, path) overloads.
Pipeline construction in RenderPipeline::CreatePipelines is unchanged: it still consumes raw m_*Spv blobs. What changed is where those blobs come from: previously runtime ShaderCompiler::Compile, now ShaderLibrary::LoadEngine("shaders/x.ext")->GetSpirV() on the shared asset.
ShaderLibrary Changes
- New ShaderLibrary::LoadEngine(engineRelPath): idempotent loader + registrar. Keys library entries by filename ("pbr.vert", "gtao_main.comp"). Internally: AssetDatabase::GetUUID → AssetManager::LoadImmediate → Register.
- Keys migrated from friendly names ("pbr", "shadowDepth") to per-stage filenames ("pbr.vert", "pbr.frag", ...).
- Reload callback now handles all 24 engine shader filenames (graphics + compute): pulls fresh SPIR-V into the cached m_*Spv blob and rebuilds the affected pipeline(s). Compute-pipeline rebuild is inline (push-constant sizes duplicated from InitCullPipeline / InitAOResources; acceptable for now).
- File watcher filter extended from .vert|.frag to .vert|.frag|.comp. Unmatched files now log a warning; the old m_PendingUtilityReload fallback flag + RenderPipeline::RecompileUtilityShaders() dead-path were deleted.
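The idempotent, filename-keyed behaviour of LoadEngine described in the list above can be sketched as follows. This is an assumption-laden illustration: the `Shader` struct, the `LoadCount` counter, and the string-based path handling are hypothetical, and the real loader goes through AssetDatabase and AssetManager rather than constructing the shader directly.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>

namespace Sketch {

struct Shader { std::string Key; };  // hypothetical stand-in for the asset

class ShaderLibrary {
public:
    // Idempotent: a repeated call returns the already-registered shader.
    std::shared_ptr<Shader> LoadEngine(const std::string& engineRelPath) {
        // Key by filename: "shaders/pbr.vert" -> "pbr.vert".
        const std::size_t slash = engineRelPath.find_last_of('/');
        const std::string key =
            (slash == std::string::npos) ? engineRelPath : engineRelPath.substr(slash + 1);
        if (auto it = m_Shaders.find(key); it != m_Shaders.end()) return it->second;
        // Real code (per the notes): GetUUID -> LoadImmediate -> Register.
        auto shader = std::make_shared<Shader>(Shader{key});
        m_Shaders.emplace(key, shader);
        ++m_LoadCount;
        return shader;
    }

    std::shared_ptr<Shader> Get(const std::string& key) const {
        auto it = m_Shaders.find(key);
        return it != m_Shaders.end() ? it->second : nullptr;
    }

    int LoadCount() const { return m_LoadCount; }

private:
    std::unordered_map<std::string, std::shared_ptr<Shader>> m_Shaders;
    int m_LoadCount = 0;  // counts actual loads, for illustration only
};

}  // namespace Sketch
```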
Scanner Change
FileSystem::GetAssetTypeFromPath now recognizes .comp. On first launch after this lands, AssetDatabase::InitEngine discovers all 8 compute shaders under luth/assets/shaders/ and generates .meta files with stable UUIDs (committed in d7ab811). Same dirty-tracking + artifact-cache path as .vert/.frag.
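Extension-based recognition of the three stages might look like the sketch below. `StageFromPath` and `EndsWith` are hypothetical helpers written for illustration; they are not the signature of FileSystem::GetAssetTypeFromPath, which is not shown in these notes.

```cpp
#include <cassert>
#include <string_view>

namespace Sketch {

enum class ShaderStage { Unknown, Vertex, Fragment, Compute };

constexpr bool EndsWith(std::string_view s, std::string_view suffix) {
    return s.size() >= suffix.size() &&
           s.substr(s.size() - suffix.size()) == suffix;
}

// Hypothetical helper: one file extension maps to exactly one stage.
constexpr ShaderStage StageFromPath(std::string_view path) {
    if (EndsWith(path, ".vert")) return ShaderStage::Vertex;
    if (EndsWith(path, ".frag")) return ShaderStage::Fragment;
    if (EndsWith(path, ".comp")) return ShaderStage::Compute;
    return ShaderStage::Unknown;
}

static_assert(StageFromPath("shaders/gtao_main.comp") == ShaderStage::Compute,
              "compute shaders are recognised like any other stage");

}  // namespace Sketch
```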
Call-site Migrations
Replaced every ShaderCompiler::Compile(shadersPath / "x.ext") in:
| File | Calls |
|---|---|
| renderer/RenderPipeline.cpp (Initialize, InitCullPipeline, InitAOResources, InitDebugBlitResources) | 20 |
| renderer/lighting/IBLPrecompute.cpp (equirect / irradiance / prefilter / brdf_lut / skybox) | 6 |
ShaderCompiler::Compile is now only invoked by (a) ShaderImporter::Import (on asset import / hot-reload artifact refresh), and (b) VulkanShader::Reload (in-memory recompile of the one stage the shader owns).
Also dropped dead #include "luth/renderer/shader/ShaderCompiler.h" from 10 files (9 passes + RenderingSystem.cpp) that no longer call it.
Key Design Decisions
Single-stage asset = disk file
One .vert file = one asset = one UUID = one artifact. Pipelines combine stages at creation time, not at import time. This removes the implicit "vert's friend is a frag with the same stem" coupling that broke any vertex-only / compute / fragment-only shader.
V2 rejects V1 instead of silent upgrade
DeserializeShader checks header.Version != 2 and returns false, which triggers a re-import on first load. No silent format conversion, no zombie V1 data on disk. Cleaner than reading both schemas and trying to pick the right one.
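A minimal sketch of the reject-on-version check, under stated assumptions: the real AssetHeader almost certainly carries more than a version field, SpirVSize is assumed to count 32-bit words, and `MakeArtifact` is a hypothetical helper that exists only to build test input.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

namespace Sketch {
using u32 = std::uint32_t;

enum class ShaderStage : u32 { Unknown, Vertex, Fragment, Compute };
struct AssetHeader { u32 Version; };                // assumed minimal layout
struct ShaderHeader { u32 Stage; u32 SpirVSize; };  // size assumed to be in words
struct ShaderAssetData { ShaderStage Stage = ShaderStage::Unknown; std::vector<u32> SpirV; };

bool DeserializeShader(const std::vector<std::uint8_t>& bytes, ShaderAssetData& out) {
    AssetHeader asset{};
    ShaderHeader shader{};
    constexpr std::size_t kHeaders = sizeof(asset) + sizeof(shader);
    if (bytes.size() < kHeaders) return false;
    std::memcpy(&asset, bytes.data(), sizeof(asset));
    if (asset.Version != 2) return false;  // V1 artifact: caller re-imports
    std::memcpy(&shader, bytes.data() + sizeof(asset), sizeof(shader));
    const std::size_t payload = std::size_t(shader.SpirVSize) * sizeof(u32);
    if (bytes.size() - kHeaders < payload) return false;  // truncated artifact
    out.Stage = static_cast<ShaderStage>(shader.Stage);
    out.SpirV.resize(shader.SpirVSize);
    if (payload != 0) std::memcpy(out.SpirV.data(), bytes.data() + kHeaders, payload);
    return true;
}

// Hypothetical helper that writes [AssetHeader][ShaderHeader][SpirV].
std::vector<std::uint8_t> MakeArtifact(u32 version, ShaderStage stage,
                                       const std::vector<u32>& spirv) {
    const AssetHeader asset{version};
    const ShaderHeader shader{static_cast<u32>(stage), static_cast<u32>(spirv.size())};
    const std::size_t payload = spirv.size() * sizeof(u32);
    std::vector<std::uint8_t> bytes(sizeof(asset) + sizeof(shader) + payload);
    std::memcpy(bytes.data(), &asset, sizeof(asset));
    std::memcpy(bytes.data() + sizeof(asset), &shader, sizeof(shader));
    if (payload != 0)
        std::memcpy(bytes.data() + sizeof(asset) + sizeof(shader), spirv.data(), payload);
    return bytes;
}

}  // namespace Sketch
```

Returning false for any other version keeps a single read path; the re-import on first load then rewrites the artifact in the V2 layout.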
PipelineManager keyed by vertex-shader UUID
GetOrCreate(shaderUUID, renderMode, cullMode, polyMode, vertSpv, fragSpv) still uses a single UUID as the cache key. Chose ShaderLibrary::Get("pbr.vert")->Handle as the canonical key for the PBR pipeline family (vs. introducing a synthetic program-UUID or hashing both). Stable across launches, consistent with the old behavior (the old "pbr" library entry WAS pbr.vert). The reload callback invalidates with the canonical key regardless of which stage was edited, so a pbr.frag reload correctly drops cached PBR pipelines.
ABC bundled into one commit
Schema rewrite (A), importer rewrite (B), and Shader/VulkanShader reshape (C) are interlocked: changing one in isolation breaks the build. The epic's issue-level checklist tracks all three, but the commit is one atomic refactor. D, E, F land as separate commits since they're each build-clean on their own.
RecompileUtilityShaders deleted (no deprecation window)
After D, every engine shader is in ShaderLibrary, so the file-watcher always finds a library match and the m_PendingUtilityReload fallback never fires. Deleted along with m_PendingUtilityReload and the drain branch in RenderingSystem::Update rather than leaving dead code. Per-epic principle: no backwards-compat shims.
Compute pipeline rebuild inline in the callback
Hot-reload of a .comp shader rebuilds the matching VKComputePipeline inline in the ShaderLibrary reload callback (push-constant sizes + descriptor-layout handle copied from the Init* call sites). Could factor out a RebuildComputePipeline(name) helper — left inline for this epic; clean-up candidate for render-pipeline-split (E6 in the review plan).
Build Verification
- 5 work commits on epic/shader-asset-pipeline; every commit builds Debug x64 clean (MSBuild /v:minimal reports zero errors; only pre-existing warnings: C4267/C4244/C4996/LNK4006).
- Full 3-project solution (Luth, Luthien, Runtime) builds unchanged.
- grep ShaderCompiler::Compile: only 2 legitimate call sites remain (ShaderImporter::Import, VulkanShader::Reload).
- grep "VertexSpirV\|FragmentSpirV": zero in source (only in history docs).
Runtime Verification (user smoke test)
- Delete luth/Library/Artifacts/ → relaunch: every .vert/.frag/.comp imports once, no Fragment shader not found error.
- Second launch: no SPIR-V recompilation (artifact mtimes unchanged).
- Full render: PBR + shadows + GTAO + bloom + skybox + outline + grid + ImGui visually identical.
- Frame Debugger captures + replays a frame.
- Skinned mesh renders with shadows.
- Edit depthPrepass.vert → hot-reload fires → depth pass rebuilds.
- Edit gtao_main.comp → compute pipeline rebuilds; AO still renders.
Lessons
Pairing assumptions leak into asset schemas. The root cause of the Fragment shader not found error wasn't the importer's check — it was that the importer had ever been modeled around graphics-pipeline topology in the first place. Fix at the asset-model level, not the importer error path.
Atomic commits don't always mean one sub-task per commit. The issue body split the refactor into A/B/C so the scope tracking stays granular on GitHub, bu...
v2.0.0 — arch-target-split
Date: 2026-04-18
Commits: 5
Issue: #78
Overview
Phase 5 of the architecture refactor — the final arch-* epic. Extracted ~12 071 LOC of editor code from Luth.lib into a new Luthien.lib static library, renamed the luthien/ exe folder to runtime/, and broke the engine→editor include dependency via an IEditorHooks interface. After this epic, Luth.lib has zero luthien/... includes and the one-way-dependency invariant is enforced by a git grep check.
Major version bump to v2.0.0 per the ROADMAP versioning rule (fundamental architecture change).
See the multi-epic plan: docs/development/ARCH-REFACTOR-PLAN.md.
Sub-Tasks
| # | Sub-task | Commit |
|---|---|---|
| A | Rename luthien/ exe folder to runtime/ | refactor(build): rename luthien exe folder to runtime |
| B | Extract editor into Luthien.lib + introduce IEditorHooks | refactor(build): extract editor into Luthien.lib |
| C | Untrack editor-state files | chore(repo): untrack editor state files |
| D | Regen + docs + v2.0.0 + history + release | chore(build): finalize target split, bump v2.0.0 |
Directory Changes
New folders
- luthien/source/luthien/: editor code (was luth/source/luth/editor/). Mirrors luth/source/luth/.
Renamed
- luthien/ exe folder → runtime/ (git mv; LuthienApp.cpp, Luthien.rc, icons/, resource.h histories preserved)
- luth/source/luth/editor/** → luthien/source/luthien/** (git mv; 30+ files + subtrees)
New files
- luth/source/luth/core/EditorHooks.{h,cpp}: IEditorHooks interface + EditorHooks::Register/Get
- luthien/premake5.lua: new Luthien.lib static-lib project
- luthien/source/lepch.{h,cpp}: editor PCH (includes luthpch.h + ImGui + Vulkan)
- luthien/source/luthien/Bootstrap.h: declares InstallLuthienEditorHooks()
- luthien/source/luthien/EditorHooks.cpp: LuthienEditorHooks impl forwarding to Editor::* / ProjectLauncher::* / EditorSelection::*
Bulk rewrites
- 142 #include "luth/editor/..." → #include "luthien/..." across 45 files (perl + binmode for CRLF preservation)
- 26 editor .cpp files: #include "luthpch.h" → #include "lepch.h" (new editor PCH)
Target layout after the epic
| Target | Kind | Links | Contents |
|---|---|---|---|
| Luth.lib | StaticLib | - | Engine only; no editor/panel code |
| Luthien.lib | StaticLib | Luth | Editor: panels, inspectors, commands, style, widgets, hook impl |
| Luthien.exe | ConsoleApp | Luth, Luthien | Editor application (runtime/ Runtime project, targetname Luthien) |
Untracked
- runtime/editor_settings.json, runtime/imgui.ini, samples/editor_settings.json, samples/cache/pipeline.bin (git rm --cached + new .gitignore patterns editor_settings.json and samples/cache/)
Key Design Decisions
IEditorHooks instead of a full API redesign
The structural split surfaced deep coupling: App.cpp drove the editor's per-frame lifecycle via 14 direct Editor::* / ProjectLauncher::* / EditorSelection::* call sites; Input.cpp queried Editor::WantCaptureKeyboard/Mouse; Luth.h (public umbrella) included Editor.h. A full API redesign (virtual App hooks, event bus, RHI layer) would have tripled scope.
Compromise: a minimal nullptr-safe IEditorHooks interface in luth/core/EditorHooks.{h,cpp} with 18 virtual methods covering the exact call-site set. LuthienEditorHooks (in Luthien.lib) forwards each call. Registration runs in runtime/LuthienApp.cpp::CreateApp before App::App() constructs, so the hook is live from the first EditorHooks::Get() call onward. A runtime-only host that skips linking Luthien.lib leaves the registry empty and every engine-side call nullptr-checks cleanly to a no-op.
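The nullptr-safe registry pattern can be sketched in a few lines. This is an illustration under assumptions: only two of the 18 methods are shown, their signatures are guesses, and `EngineWantsKeyboardInput`, `EngineTick`, and `FakeEditorHooks` are hypothetical names.

```cpp
#include <cassert>

namespace Sketch {

// Interface lives on the engine side; method signatures are assumptions.
struct IEditorHooks {
    virtual ~IEditorHooks() = default;
    virtual bool WantCaptureKeyboard() const = 0;
    virtual void OnUpdate(float dt) = 0;
};

class EditorHooks {
public:
    static void Register(IEditorHooks* hooks) { s_Hooks = hooks; }
    static IEditorHooks* Get() { return s_Hooks; }
private:
    static inline IEditorHooks* s_Hooks = nullptr;  // empty until an editor registers
};

// Engine-side call sites: every use nullptr-checks, so a host that never
// registers hooks gets a no-op instead of a link-time editor dependency.
inline bool EngineWantsKeyboardInput() {
    const IEditorHooks* hooks = EditorHooks::Get();
    return hooks == nullptr || !hooks->WantCaptureKeyboard();
}

inline void EngineTick(float dt) {
    if (IEditorHooks* hooks = EditorHooks::Get()) hooks->OnUpdate(dt);
}

// Editor-side implementation (would live in the editor library).
struct FakeEditorHooks final : IEditorHooks {
    int Updates = 0;
    bool WantCaptureKeyboard() const override { return true; }
    void OnUpdate(float) override { ++Updates; }
};

}  // namespace Sketch
```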
EditorViewportState snapshot instead of per-getter dispatch
App::Run's per-frame block was building CameraParams from Editor::GetPanel<ScenePanel>()->GetEditorCamera().GetViewMatrix() + 6 more getters. Putting each behind a virtual call would cost 10+ dispatches per frame. Replaced with a single IEditorHooks::GetViewportState(EditorViewportState&) that fills a POD (view/proj/pos/near/far/IBL/selection) in one roundtrip. Engine builds CameraParams from it.
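The snapshot approach trades many small virtual calls for one that fills a plain struct. In the sketch below the field names, the bool return type, and the `CameraParams` layout are all assumptions; the notes only state that a single GetViewportState(EditorViewportState&) call fills a POD.

```cpp
#include <cassert>

namespace Sketch {

// POD snapshot; field names are assumptions based on the list in the notes.
struct EditorViewportState {
    float View[16];
    float Proj[16];
    float Position[3];
    float Near;
    float Far;
};

struct CameraParams {  // hypothetical engine-side struct
    float Position[3] = {0.0f, 0.0f, 0.0f};
    float Near = 0.1f;
    float Far = 1000.0f;
};

struct IEditorHooks {
    virtual ~IEditorHooks() = default;
    // Assumed to return false when no viewport is available.
    virtual bool GetViewportState(EditorViewportState& out) const = 0;
};

// One virtual dispatch per frame instead of one per getter.
inline CameraParams BuildCameraParams(const IEditorHooks* hooks) {
    CameraParams params;
    EditorViewportState state{};
    if (hooks != nullptr && hooks->GetViewportState(state)) {
        for (int i = 0; i < 3; ++i) params.Position[i] = state.Position[i];
        params.Near = state.Near;
        params.Far = state.Far;
    }
    return params;
}

struct FakeHooks final : IEditorHooks {
    mutable int Calls = 0;  // counts dispatches, for illustration only
    bool GetViewportState(EditorViewportState& out) const override {
        ++Calls;
        out.Position[0] = 1.0f; out.Position[1] = 2.0f; out.Position[2] = 3.0f;
        out.Near = 0.5f;
        out.Far = 500.0f;
        return true;
    }
};

}  // namespace Sketch
```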
Sandbox.exe descoped
Issue #78 sub-task D requested a Sandbox.exe target. Earlier experiments had one and it was removed as clutter. The structural goal (prove Luth.lib ships without editor) is enforced more cheaply:
- Physical: after B, Luth.lib's files { "source/**" } glob excludes luthien/, so editor .obj cannot link in
- Invariant: git grep -l 'luthien/\|Luthien' luth/source → zero (gated in D)
A future player/standalone harness can be added when there's a real consumer.
Layout luthien/source/luthien/ (mirrors luth/source/luth/)
Alternative was editor/source/luthien/ from ARCH-REFACTOR-PLAN.md. Chose the flatter form for symmetry with the engine layout — one less nesting level, and the Luthien brand is visible at repo root.
Sub-task order reversed from issue #78
Issue ordered (A) extract editor, (C) rename luthien/→runtime/. Flipped to (A) rename first, (B) move editor in. The rename is low-risk (pure git mv + premake tweak); doing it first frees the luthien/ folder name, then the high-risk ~12k-LOC move lands in an empty target. Risk-staging improved.
VS project-name + binary-name split
New Luthien.lib wanted project "Luthien". Old Luthien.exe also had that project name. Collision resolved: exe project renamed to "Runtime" with targetname "Luthien" so the output binary stays Luthien.exe. startproject "Runtime". CI artifact paths in .github/workflows/build.yml updated bin/.../Luthien/ → bin/.../Runtime/.
Circular static-lib dependency avoided
Before the hooks interface, the compile-clean intermediate state had Luth.lib referencing Editor:: symbols and Luthien.lib referencing Luth:: symbols — a cyclic static-lib dependency that MSVC's multi-pass linker handles at exe link time. After IEditorHooks, the cycle is gone: Luth.lib has no unresolved editor symbols, Luthien.lib depends on Luth, Luthien.exe links both. Clean one-way.
imgui refs in luth/source are legitimate engine infrastructure
The initial spec's D-verify included git grep -l 'imgui' luth/source → zero matches. This check was based on a misunderstanding. The engine legitimately uses ImGui in its render pipeline:
- renderer/passes/ImGuiPass.cpp: render-graph pass that composites ImGui draw data
- renderer/RenderPipeline.cpp: wires the ImGui pass into the graph
- renderer/FrameDebugger.cpp: capture-time UI
- renderer/rendergraph/ArchivedImage.{h,cpp}: archive-image UI
- platform/WinWindow.cpp: GLFW↔ImGui event bridging
- scene/systems/RenderingSystem.cpp: adds the ImGui pass
These are engine-level ImGui integrations, not editor code. The correct engine-cleanness invariant is luthien/|Luthien grep returning zero, not ImGui absence.
Shrinkage / Measurement
Editor LOC moved: ~12 071 (from ARCH-REFACTOR-PLAN.md's pre-epic tally). New engine-side code: ~120 LOC (EditorHooks.h/.cpp + Bootstrap.h + LuthienEditorHooks.cpp impl).
Post-split binary sizes (Debug x64):
| Binary | Size | Notes |
|---|---|---|
| Luth.lib | 294 MB | Engine only; debug symbols bloat it |
| Luthien.lib | 141 MB | Editor + ImGui + ImGuizmo glue |
| Luthien.exe | 18 MB | Runtime binary linking both libs |
Pre-epic Luth.lib baseline not captured — measurement deferred (would require a pre-epic rebuild).
Lessons
Editor-extraction surfaced engine→editor coupling. App.cpp had 14 direct editor calls driving the editor's per-frame lifecycle; Input.cpp queried ImGui-capture state; Luth.h pulled Editor.h. The physical file move alone left this at include level. Adding IEditorHooks + nullptr-safe dispatch broke it cleanly with minimal API surface. A virtual-method App redesign would have been 5× the work — the hook interface is the right scope for a structural epic.
Git rename threshold can miss small files. Command.h had 10 lines total, 5 of which were #include "luth/editor/commands/..." lines that the perl rewrite changed. That's a 50%+ edit by git's default similarity metric, so git showed the move as delete + create instead of a rename. Larger files with the same 5-line change detect cleanly. --find-renames=30% on git log / git diff lowers the threshold if needed.
Bulk rewrites with perl + binmode. Same recipe as arch-cleanup: perl -i -pe 'BEGIN{binmode(ARGV);binmode(STDOUT);} s|old|new|g' preserves CRLF on Windows. 168 rewrites in this epic (142 include paths + 26 PCH switches) all clean — no stray LF/CRLF flips in git diff.
ImGui is engine infrastructure, not editor-exclusive. The spec's initial assumption was that ImGui only lives in editor code. Real dependency is in 6 engine files across rendergraph, frame debugger, and platform. The correct editor-cleanness invariant is luthien/|Luthien grep, not imgui grep.
Preserved a functioning editor throughout. Every commit landed build-clean with full editor parity. No multi-commit broken state — sub-staging (B1/B2/B3 contingency) wasn't needed because the hooks interface was designed before any engine code was rewritten, keeping each commit atomic.
Build Verification
- 4 work commits (plus the kickoff docs(epic): add arch-target-split spec)
- All sub-tasks build Debug x64 clean at HEAD; warnings are pre-existing noise (C4267 size_t→uint32_t, C4244 chrono::rep, C4996 getenv, LNK4006 Vulkan import-descriptor duplicates)
- 3-...
v1.7.0 — arch-renderer-split
Date: 2026-04-18
Commits: 9
Issue: #77
Overview
Phase 3–4 of the architecture refactor. Dissolve the ~3 500-LOC RenderingSystem god-class (which lived in scene/systems/ but was the de-facto renderer) into focused classes under renderer/. Consolidate scattered animation code into a new animation/ module.
After this epic, scene/systems/RenderingSystem is a ~350-LOC ECS glue layer; all graphics resources (pipelines, descriptor sets, SPIR-V, UBOs, SSBOs, IBL maps, bloom textures, GPU timers, named-texture registry, preview textures) live on RenderPipeline in renderer/; animation data has its own top-level module.
See the multi-epic plan: docs/development/ARCH-REFACTOR-PLAN.md.
Sub-Tasks
| # | Sub-task | Commit |
|---|---|---|
| A | Extract FrameTargets from RenderingSystem | refactor(render): extract FrameTargets from RenderingSystem |
| B | Extract DrawListBuilder | refactor(render): extract DrawListBuilder from RenderingSystem |
| C | Extract LightGatherer + CascadeBuilder | refactor(render): extract light gathering + CSM cascade build |
| D | Extract RenderPipeline (graph assembly) | refactor(render): extract RenderPipeline graph assembly |
| E1 | Thin RenderingSystem: init migration | refactor(scene): thin RenderingSystem (init migration) |
| E2 | Thin RenderingSystem: per-frame + debug migration | refactor(scene): thin RenderingSystem (per-frame + debug migration) |
| E3 | Thin RenderingSystem: field ownership migration | refactor(scene): thin RenderingSystem (field ownership migration) |
| - | Fix skybox reload on project change | fix(editor): reload skybox on project change |
| F | Consolidate animation/ module | refactor(animation): consolidate animation module |
Directory Changes
New files
- renderer/FrameTargets.{h,cpp}: owns persistent scene textures (SceneColor/Depth, LDR, EntityID, Selection {mask,depth})
- renderer/DrawListBuilder.{h,cpp}: walks ECS once, partitions entities into opaque/cutout/transparent draw buckets
- renderer/draw/DrawList.h: bucket struct with tri-count summary
- renderer/lighting/LightGatherer.{h,cpp}: ECS → LightUniforms + shadow config
- renderer/lighting/CascadeBuilder.{h,cpp}: PSSM split + per-cascade ortho fit
- renderer/RenderPipeline.{h,cpp}: graph assembly + all graphics resources (~3 150 LOC)
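The single-pass bucket partition that DrawListBuilder is described as doing can be sketched like this. The `Renderable` record, the `BlendMode` enum, and `BuildDrawList` are hypothetical; the real builder walks ECS components rather than a flat vector.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

namespace Sketch {

enum class BlendMode { Opaque, Cutout, Transparent };

struct Renderable {  // hypothetical flattened view of one ECS entity
    std::uint32_t Entity;
    BlendMode Mode;
    std::uint32_t TriangleCount;
};

struct DrawList {  // buckets plus a triangle-count summary
    std::vector<std::uint32_t> Opaque;
    std::vector<std::uint32_t> Cutout;
    std::vector<std::uint32_t> Transparent;
    std::uint64_t TotalTriangles = 0;
};

// One walk over the scene: each entity lands in exactly one bucket.
inline DrawList BuildDrawList(const std::vector<Renderable>& scene) {
    DrawList list;
    for (const Renderable& r : scene) {
        switch (r.Mode) {
            case BlendMode::Opaque:      list.Opaque.push_back(r.Entity); break;
            case BlendMode::Cutout:      list.Cutout.push_back(r.Entity); break;
            case BlendMode::Transparent: list.Transparent.push_back(r.Entity); break;
        }
        list.TotalTriangles += r.TriangleCount;
    }
    return list;
}

}  // namespace Sketch
```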
New folder
- animation/: houses AnimationClip.h, Skeleton.h, BoneMatrixBuffer.{h,cpp}, AnimationController.h (all moved via git mv, history preserved)
Moved
- 5 files moved into animation/ (from renderer/ and scene/)
- 19 caller files bulk-rewrote their includes
Added to lighting/LightTypes.h
- DirectionalLightShadowParams: per-frame shadow config from Component::DirectionalLight
- CascadeData: per-frame CSM output (4 matrices + split view-Z + texel sizes)
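For the "split view-Z" values in CascadeData, one common PSSM formulation is the practical split scheme, which blends logarithmic and uniform splits with a weight λ. Whether CascadeBuilder uses exactly this blend is not stated in these notes, so treat the sketch as an assumption; `ComputeSplits` and `kCascadeCount` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cmath>

namespace Sketch {

constexpr int kCascadeCount = 4;  // matches the 4 matrices in CascadeData

// Practical split scheme: far plane of cascade i is
//   lambda * n * (f/n)^(i/N) + (1 - lambda) * (n + (f - n) * i/N)
inline std::array<float, kCascadeCount> ComputeSplits(float nearZ, float farZ,
                                                      float lambda) {
    std::array<float, kCascadeCount> splits{};
    for (int i = 1; i <= kCascadeCount; ++i) {
        const float t = static_cast<float>(i) / static_cast<float>(kCascadeCount);
        const float logSplit = nearZ * std::pow(farZ / nearZ, t);
        const float uniformSplit = nearZ + (farZ - nearZ) * t;
        splits[i - 1] = lambda * logSplit + (1.0f - lambda) * uniformSplit;
    }
    return splits;
}

}  // namespace Sketch
```

With λ = 0 the splits are evenly spaced; with λ = 1 they are logarithmic, which concentrates shadow-map resolution near the camera.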
Shrinkage
| File | Before | After | Δ |
|---|---|---|---|
| scene/systems/RenderingSystem.cpp | ~3 500 LOC | 348 LOC | −90% |
| scene/systems/RenderingSystem.h | ~490 LOC | 194 LOC | −60% |
The remaining RenderingSystem is the ECS glue layer the spec targeted: Update() orchestration, UpdateLightUniforms() (CPU-side gather + cascade build), mouse picking, editor-facing getters/setters, frame-debugger state, shader hot-reload dispatch. The ~100-LOC aspirational target was approached but not hit — RenderingSystem still holds FrameTargets, DrawListBuilder, LightGatherer, CascadeBuilder, FrameDebugger, CameraParams, and editor state, because those are all scene-level inputs per the spec's target shape.
Key Design Decisions
Bidirectional friend class kept
friend class RenderPipeline; on RenderingSystem allows RenderPipeline methods to read RS-side per-frame inputs (CameraParams, ShadowParams, Cascades, FrameTargets, FrameDebugger, DrawList, editor toggles) without widening the public API to ~25 getters. The coupling is intrinsic: Pipeline consumes scene state that by design lives on the ECS-glue layer. Dropping friend was an aspirational goal, not worth the verbosity. RenderingSystem.h drops most Vulkan includes as a result — only VkSampler (via FrameDebugger) and VkImageView (preview getter return types) remain.
Sub-task D staging
Moving the entire graph-assembly chain + all 13 Add*Pass methods + CollectSelectedHandles + CaptureSnapshot was a 1 500-LOC mechanical move. Executed atomically via perl rewrite: RenderingSystem:: → RenderPipeline:: on class qualifiers, m_X → m_System.m_X on member accesses (since fields still lived on RS at that point). Friend class granted access. E1–E3 later inverted the perl rewrite for fields that migrated.
Sub-task E sub-staged into E1/E2/E3
The "≤ 100 LOC" target in the spec required ~2 000 LOC of migration across 4 files — too risky for a single commit. Split into three atomic sub-commits:
- E1 — init routines + ctor/dtor → Pipeline::Initialize/Shutdown/OnResize. RenderingSystem.cpp: 3 073 → 1 194 LOC.
- E2 — per-frame Update* helpers + BuildGPUObjectBuffer + debug blit/preview helpers moved. 1 194 → 324 LOC.
- E3 — field ownership migration. ~40 graphics fields moved from RS to Pipeline; pass files bulk-rewrote m_System.m_X → m_X. RS-retained fields kept their m_System. prefix via negative-lookbehind perl.
k_MaxGPUObjects + indirect-region constants moved
The static constexpr u32 k_MaxGPUObjects = 4096; constant (plus k_IndirectRegionCount / k_IndirectRegionStride) migrated from RenderingSystem:: private statics to RenderPipeline:: public statics in E3. Pass files consume them via RenderPipeline::k_MaxGPUObjects.
animation/ module carved out (sub-task F)
AnimationClip, Skeleton, BoneMatrixBuffer (data + GPU buffer) moved from renderer/. AnimationController (blend controller) moved from scene/. AnimationSystem stayed in scene/systems/ because it walks Component::Animation + Component::BoneAttachment — ECS territory by design.
Skybox Init Bugfix
Partway through E3 verification, the skybox rendered black at startup until the user manually reloaded it via the editor. Root cause: RenderingSystem::ctor runs during App::App before any project loads. FileSystem::ResolveAsset("textures/environment.hdr") falls back to engine assets (s_HasProject = false), but the engine ships no HDR — only samples/assets/textures/environment.hdr exists. IBL::Precompute hits its fallback path, returns 1×1 placeholders without skybox SPIR-V, and CreatePipelines silently skips the skybox pipeline. Likely pre-existing but surfaced during refactor verification.
Fix: Editor::OnProjectChanged (tail of App::LoadProject) now re-resolves the settings' skyboxPath against the freshly set project root and calls ReloadSkybox if the resolved path exists.
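The control flow of the fix can be sketched as follows. This is a minimal illustration under stated assumptions, not the engine's code: `ResolveInProject`, `ReloadSkyboxOnProjectChange`, and the injected `ExistsFn` predicate are hypothetical names, and only the standard library is used.

```cpp
#include <cassert>
#include <filesystem>
#include <functional>
#include <optional>
#include <string>

namespace fs = std::filesystem;

// Hypothetical predicate type so the existence check can be swapped out
// (e.g. for fs::exists in production, or a stub in tests).
using ExistsFn = std::function<bool(const fs::path&)>;

// Resolve a project-relative asset path against the project root.
// Returns the full path only if the predicate reports that the file exists.
std::optional<fs::path> ResolveInProject(const fs::path& projectRoot,
                                         const std::string& relativePath,
                                         const ExistsFn& exists) {
    if (relativePath.empty()) return std::nullopt;
    fs::path candidate = projectRoot / relativePath;
    if (exists(candidate)) return candidate;
    return std::nullopt;
}

// Hypothetical project-changed hook: reload the skybox only when the
// settings path resolves inside the freshly set project root.
bool ReloadSkyboxOnProjectChange(const fs::path& projectRoot,
                                 const std::string& skyboxPath,
                                 const ExistsFn& exists,
                                 const std::function<void(const fs::path&)>& reloadSkybox) {
    assert(exists && reloadSkybox && "callbacks must be set");
    if (auto resolved = ResolveInProject(projectRoot, skyboxPath, exists)) {
        reloadSkybox(*resolved);
        return true;
    }
    return false;  // nothing resolved: keep whatever is currently bound
}
```

The point of the sketch is the ordering: resolution happens after the project root is known, so the lookup no longer depends on what the engine ships at construction time.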
Lessons
Scope ambition vs. commit granularity. The spec's "≤ 100 LOC" target for RenderingSystem.cpp was the right aspiration but couldn't be delivered atomically. The epic spec itself suggested sub-staging D into D1/D2/D3 if needed; E adopted the same pattern. Per-commit build verification is non-negotiable; the single-commit target would have been a multi-day breakage risk.
Perl lookbehind saves double-rewrites. Each sub-task's bulk rewrite had to avoid re-rewriting already-prefixed accesses. (?<!m_System\.)\bm_X\b is the pattern — it runs idempotently over a file that's been partially rewritten, so repeat runs are safe.
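For readers without perl at hand, the same idempotent rewrite can be emulated with a plain scan. The sketch below is not the script used in the epic. `PrefixMember` is a hypothetical helper that checks the text immediately before each whole-word match, which is what the `(?<!m_System\.)` lookbehind does. (`std::regex` in its default ECMAScript grammar has no lookbehind, hence the manual check.)

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>

// True for characters that can be part of a C++ identifier.
static bool IsIdentChar(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
}

// Prefix every whole-word occurrence of `member` with `prefix`, unless the
// occurrence is already directly preceded by `prefix` in the input.
std::string PrefixMember(std::string_view src, std::string_view member,
                         std::string_view prefix) {
    assert(!member.empty());
    std::string out;
    out.reserve(src.size() + prefix.size());
    for (std::size_t i = 0; i < src.size();) {
        const std::size_t end = i + member.size();
        const bool textMatches = src.compare(i, member.size(), member) == 0;
        const bool wordStart = i == 0 || !IsIdentChar(src[i - 1]);
        const bool wordEnd = end >= src.size() || !IsIdentChar(src[end]);
        // Emulated negative lookbehind: inspect the input just before i.
        const bool alreadyPrefixed =
            i >= prefix.size() &&
            src.substr(i - prefix.size(), prefix.size()) == prefix;
        if (textMatches && wordStart && wordEnd && !alreadyPrefixed) {
            out.append(prefix);
            out.append(member);
            i = end;
        } else {
            out.push_back(src[i]);
            ++i;
        }
    }
    return out;
}
```

Running the function a second time over its own output leaves the text unchanged, which is the property that makes repeat runs over a partially rewritten file safe.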
Refactor verification surfaces pre-existing bugs. The skybox-black issue likely existed before the epic — the refactor just put eyes on it. Worth a post-epic pass on anything that looks "working" but might have a similar latent flaw.
File moves with git mv preserve blame. Sub-task F's 5-file move kept git log --follow history intact; the diff showed 0-line changes on the moved files themselves. Trivial but important for ongoing code archaeology.
Build Verification
- Debug x64 builds clean after every sub-task (9 incremental builds)
- Premake regeneration clean on each sub-task
- No new warnings (only pre-existing C4267 / LNK4006 noise)
- Luth.lib + Luthien.exe artifacts produced
Runtime verification (user-confirmed)
- A: FrameTargets resize + render-pass parity ✅
- B: tri-count + opaque/cutout/transparent ordering preserved ✅
- C: CSM cascades split identically ✅
- D: full visual parity + Frame Debugger works ✅
- E1/E2/E3: startup, shutdown, resize, hot-reload, picking, capture ✅
- F: skinned mesh animation + bone debug overlay ✅
- Skybox fix: loads correctly when project opens ✅
v1.6.0 — arch-cleanup
Date: 2026-04-18
Commits: 8
Issue: #76
Overview
Phase 1–2 of the architecture refactor. Low-risk mechanical moves that clean up folder misalignment, preparing the tree for the larger arch-renderer-split and arch-target-split epics. No behavior change — all work is structural.
See the multi-epic plan: docs/development/ARCH-REFACTOR-PLAN.md.
Sub-Tasks
| # | Sub-task | Commit |
|---|---|---|
| A | Extract events/ from platform/ | refactor(events): extract event types from platform/ |
| B | Disperse utils/ into editor/core/resources | refactor(utils): disperse utils/ into editor/core/resources |
| C | Move FrameData.h from renderer to core | refactor(core): move FrameData from renderer to core |
| D | Rename Systems→SystemRegistry, fix ownership | refactor(scene): rename Systems->SystemRegistry, fix ownership |
| E | Split Components.h into components/ subfolder | refactor(scene): split Components.h into components/ subfolder |
| F | Normalize POD component field naming | refactor(scene): normalize POD component field naming |
| G | Subdivide renderer/ into concept folders | refactor(render): subdivide renderer/ into concept folders |
| H | Extract LightTypes.h from RenderingSystem | refactor(render): extract LightTypes from RenderingSystem |
Directory Changes
New folders
- events/ — extracted from platform/
- editor/widgets/ — from utils/ (Icons, ImGuiUtils)
- scene/components/ — granular component headers
- renderer/resources/ — Buffer, Mesh, Model, Texture
- renderer/material/ — Material, MaterialSystem
- renderer/shader/ — Shader, ShaderCompiler, ShaderLibrary
- renderer/pipeline/ — PipelineManager
- renderer/lighting/ — IBLPrecompute, LightTypes (new)
- renderer/settings/ — GTAOSettings, PostProcessSettings
- renderer/draw/ — DrawCommand
Removed folders
- utils/ — dispersed
Renamed
- scene/System.h → scene/systems/ISystem.h (class System → ISystem)
- scene/Systems.{h,cpp} → scene/systems/SystemRegistry.{h,cpp} (class Systems → SystemRegistry)
- utils/LuthIcons.h → editor/widgets/Icons.h
- utils/ImGuiUtils.h → editor/widgets/ImGuiUtils.h
- utils/CustomFormatters.h → core/LogFormatters.h
- utils/ImageUtils.cpp → resources/ImageUtils.cpp
- renderer/FrameData.h → core/FrameData.h
Moved
- 25 files from renderer/ top level into concept subfolders (sub-task G)
- 7 event files from platform/ to events/ (sub-task A)
Key Design Decisions
SystemRegistry ownership fix
Previous Systems::AddSystem<T>() emplaced a unique_ptr<T> into a vector<shared_ptr<System>> — implicit conversion masked an ownership-model bug. Fixed:
- Storage: vector<unique_ptr<ISystem>> (manager is sole owner)
- GetSystem<T>() returns non-owning T* (was shared_ptr<T>)
- All 5 panel members updated from shared_ptr<RenderingSystem> to raw RenderingSystem*
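The ownership model described above can be sketched as below. This is a minimal illustration, assuming a `dynamic_cast` lookup and example systems that are purely hypothetical; the actual registry may look systems up differently.

```cpp
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

struct ISystem {
    virtual ~ISystem() = default;
    virtual void Update(float dt) = 0;
};

class SystemRegistry {
public:
    // The registry is the sole owner; callers only ever get a raw observer.
    template <typename T, typename... Args>
    T* AddSystem(Args&&... args) {
        static_assert(std::is_base_of_v<ISystem, T>, "T must derive from ISystem");
        auto system = std::make_unique<T>(std::forward<Args>(args)...);
        T* raw = system.get();
        m_Systems.push_back(std::move(system));
        return raw;
    }

    // Non-owning lookup; nullptr when no system of type T is registered.
    // (Assumption: linear dynamic_cast search, chosen here for brevity.)
    template <typename T>
    T* GetSystem() const {
        for (const auto& s : m_Systems)
            if (auto* typed = dynamic_cast<T*>(s.get())) return typed;
        return nullptr;
    }

    void Update(float dt) {
        for (auto& s : m_Systems) s->Update(dt);
    }

private:
    std::vector<std::unique_ptr<ISystem>> m_Systems;
};

// Hypothetical example systems, for illustration only.
struct RenderingSystem : ISystem {
    int frames = 0;
    void Update(float) override { ++frames; }
};
struct AnimationSystem : ISystem {
    void Update(float) override {}
};
```

Holders of the returned pointer (such as editor panels) must not outlive the registry, which is the trade made in exchange for a single, unambiguous owner.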
POD component field naming
struct ID { UUID Value; } chosen over struct ID { UUID ID; } to avoid struct-name/member-name shadow collision. Value applied uniformly across ID, Tag, Parent, Children (newtype wrapper convention). 14 caller files updated via .m_X → .Value.
Components.h umbrella
Split into 6 granular headers (Common, Transform, Camera, Rendering, Lights, Animation) but kept Components.h as a 7-line umbrella #include. Existing #include "luth/scene/Components.h" callsites required no changes.
LightTypes.h extraction
DirectionalLightData, PointLightData, LightUniforms, k_ShadowCascadeCount, k_ShadowResolution moved from RenderingSystem.h (which includes Vulkan + scene + renderer headers) to a pure-data header with only core/LuthTypes.h + glm.hpp as dependencies. Shader reflection, tests, and future tools can include freely.
Lessons
Bulk text rewrites: use perl with BEGIN{binmode}, not sed. sed -i on Git Bash for Windows silently converts CRLF → LF on every file it touches, even files with no match — produced 170+ bogus "modified" entries in git status during sub-task A. perl -i -pe 'BEGIN{binmode(ARGV);binmode(STDOUT);} s|...|...|g' reads/writes in binary mode and preserves CRLF byte-for-byte.
Ownership bugs hide behind implicit conversions. The unique_ptr→shared_ptr container mismatch in Systems compiled cleanly because vector::emplace_back accepts anything convertible to the element type. Worth grepping for "raw pointer returned from smart container" patterns as a class of future bugs.
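Why the mismatch compiled can be shown in a few lines. The types below are hypothetical stand-ins. The relevant fact is that `std::shared_ptr<Base>` has a converting constructor taking `std::unique_ptr<Derived>&&`, so `emplace_back` accepts the temporary without any diagnostic.

```cpp
#include <cstddef>
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins for the pre-refactor class names.
struct System {
    virtual ~System() = default;
};
struct RenderingSystem : System {};

// unique_ptr<Derived>&& converts to shared_ptr<Base>: ownership silently
// changes from exclusive to shared.
static_assert(std::is_constructible_v<std::shared_ptr<System>,
                                      std::unique_ptr<RenderingSystem>&&>);
// The reverse direction does not exist, so a unique_ptr container would have
// rejected a shared_ptr argument at compile time.
static_assert(!std::is_constructible_v<std::unique_ptr<System>,
                                       std::shared_ptr<RenderingSystem>>);

// Compiles cleanly even though the author wrote make_unique.
std::size_t AddToSharedContainer(std::vector<std::shared_ptr<System>>& systems) {
    systems.emplace_back(std::make_unique<RenderingSystem>());
    return systems.size();
}
```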
Build Verification
- Debug x64 builds clean after every sub-task (8 incremental builds)
- Premake regeneration clean (scripts\setup\setup_windows.bat equivalent)
- No new warnings
- Luth.lib + Luthien.exe artifacts produced
v1.5.0 — gtao
v1.5.0 — GTAO (Ground Truth Ambient Occlusion)
Version: v1.5.0 | Date: 2026-04-17 | Epic: #58 | Deps: compute-gpu-culling (v1.2.0)
What Was Built
Screen-space ambient occlusion via Ground Truth AO (Jimenez et al. 2016, XeGTAO inspiration). Replaces Luth's flat ao = 1.0 default ambient term with a physically-grounded occlusion signal that modulates the split-sum IBL contribution, dramatically improving the grounding of objects in scenes dominated by indirect light. Compute-only, mip-0 only (no LDS mip chain), no temporal accumulation yet — an MVP that slots into the existing render graph + compute pass infrastructure shipped in compute-gpu-culling.
Pipeline (per frame, after shadows, before forward shading):
| Stage | Input | Output | Notes |
|---|---|---|---|
| DepthPrepass (new, opaque-only forward) | Indirect draws | SceneDepth (D32) | Enables GTAO to read depth before PBR shades; GeometryPass now loads depth with LESS_EQUAL |
| GTAODepthPrefilter (compute) | SceneDepth | GTAOLinearDepth (R32F, half-res) | 2×2 min-gather + perspective linearize; sky pixels clamped to farZ |
| GTAOMain (compute) | GTAOLinearDepth | GTAORawAO (R8, half-res) | Horizon-based integral, 2–8 slices, IGN jitter; VS normals reconstructed from depth derivatives |
| GTAODenoise (compute) | GTAORawAO + GTAOLinearDepth | GTAOFinal (R8, half-res) | 3×3 tent + bilateral depth weight (~10% relative sigma) |
| GeometryPass (pbr.frag) | GTAOFinal (Set 0 binding 4) | SceneColor | ambient *= gtaoAO — multiplies material occlusion if present |
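A CPU-side sketch of the prefilter stage may help when reading the table. It is not the shader source. It assumes a standard (non-reversed) perspective depth buffer in [0, 1], where the minimum of a 2×2 quad is the nearest sample, and a row-major depth array. Both function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Convert a [0, 1] depth-buffer value to positive view-space distance.
// Assumption: standard perspective projection, 0 = near plane, 1 = far plane.
float LinearizeDepth(float d, float nearZ, float farZ) {
    assert(nearZ > 0.0f && farZ > nearZ);
    return (nearZ * farZ) / (farZ - d * (farZ - nearZ));
}

// Downsample full-res depth to half-res linear depth: take the nearest of
// each 2x2 quad, linearize it, and clamp to farZ (sky pixels land on farZ).
std::vector<float> PrefilterDepth(const std::vector<float>& depth,
                                  std::size_t width, std::size_t height,
                                  float nearZ, float farZ) {
    assert(width % 2 == 0 && height % 2 == 0);
    assert(depth.size() == width * height);
    const std::size_t halfW = width / 2;
    const std::size_t halfH = height / 2;
    std::vector<float> out(halfW * halfH);
    for (std::size_t y = 0; y < halfH; ++y) {
        for (std::size_t x = 0; x < halfW; ++x) {
            const std::size_t x0 = 2 * x;
            const std::size_t y0 = 2 * y;
            const float nearest = std::min({depth[y0 * width + x0],
                                            depth[y0 * width + x0 + 1],
                                            depth[(y0 + 1) * width + x0],
                                            depth[(y0 + 1) * width + x0 + 1]});
            out[y * halfW + x] = std::min(LinearizeDepth(nearest, nearZ, farZ), farZ);
        }
    }
    return out;
}
```

With a reversed-Z depth buffer the gather would use the maximum instead and the linearization formula would change accordingly.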
- Z-prepass. Depth-only forward pass using the camera region of the existing indirect buffer; position-only vertex shader (rigid + skinned variants), reuses shadowDepth.frag as null fragment. GeometryPass switched to LOAD_OP_LOAD + VK_COMPARE_OP_LESS_OR_EQUAL so opaques pass on equal-z. Unblocks both GTAO and the future forward-plus (#54) cluster pipeline.
- GTAOSettings. Runtime-tunable struct nested in PostProcessSettings: enabled / halfRes / visualize, intensity / radius / falloff / power, sliceCount (2/3/4/8) / stepsPerSlice. Editor section in RenderPanel with XeGTAO-recommended defaults (radius 0.5 m, falloff 0.615, power 2.0, 3 slices × 2 steps). Mirrored to GPU via a 48-byte std140 GTAOUBO, refreshed each frame in UpdateGTAOUBO().
- Set 0 expansion. Two new bindings sampled by pbr.frag: binding 4 = sampler2D gtaoTex, binding 5 = GTAOUBO. Descriptor writes live in UpdateAODescriptors (called from InitAOResources and after Resize recreates the half-res textures).
- Frame Debugger support. GTAOLinearDepth / GTAORawAO / GTAOFinal registered as tracked render targets. Added R8_Unorm and R32_Float to both RG::TextureFormat and FrameDebugger::ToVkFormat so archive images allocate at native format instead of silently falling through to RGBA8_UNORM (which previously caused rainbow-banding previews for both GTAO buffers).
- Visualize mode. gtao.visualize toggles the PBR shader to output the raw GTAO buffer as the scene color — isolates AO contribution for tuning without writing a dedicated debug pass.
- Always-on chain. GTAO runs every frame regardless of enabled; the shader's enabled flag gates the modulation inside pbr.frag. This avoids first-frame layout-transition ordering issues (the Set 0 binding-4 sampler always sees a valid SHADER_READ_ONLY_OPTIMAL layout).
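One way the settings struct could be mirrored into a 48-byte std140 block is sketched below. The field order, the padding, and the `intensity` default are assumptions; the notes only state the field names, the listed defaults, and the total size. In std140, `bool`, `int`, `uint`, and `float` scalars each occupy 4 bytes with 4-byte alignment, so nine scalars take 36 bytes and the sketch pads to 48.

```cpp
#include <cstddef>
#include <cstdint>

// CPU-side settings (defaults for radius/falloff/power/slices/steps are the
// ones listed in the release notes; the intensity default is an assumption).
struct GTAOSettings {
    bool enabled = true;
    bool halfRes = true;
    bool visualize = false;
    float intensity = 1.0f;
    float radius = 0.5f;
    float falloff = 0.615f;
    float power = 2.0f;
    std::uint32_t sliceCount = 3;
    std::uint32_t stepsPerSlice = 2;
};

// Hypothetical std140 mirror. Field order and padding are illustrative only.
struct alignas(16) GTAOUBO {
    std::uint32_t enabled;
    std::uint32_t halfRes;
    std::uint32_t visualize;
    float intensity;
    float radius;
    float falloff;
    float power;
    std::uint32_t sliceCount;
    std::uint32_t stepsPerSlice;
    std::uint32_t pad[3];  // pad 36 bytes of payload up to 48
};
static_assert(sizeof(GTAOUBO) == 48, "UBO mirror must stay 48 bytes");
static_assert(offsetof(GTAOUBO, stepsPerSlice) == 32, "scalars are tightly packed");

// Per-frame conversion; bools widen to 32-bit flags as GLSL expects.
GTAOUBO ToUBO(const GTAOSettings& s) {
    GTAOUBO u{};
    u.enabled = s.enabled ? 1u : 0u;
    u.halfRes = s.halfRes ? 1u : 0u;
    u.visualize = s.visualize ? 1u : 0u;
    u.intensity = s.intensity;
    u.radius = s.radius;
    u.falloff = s.falloff;
    u.power = s.power;
    u.sliceCount = s.sliceCount;
    u.stepsPerSlice = s.stepsPerSlice;
    return u;
}
```

The `static_assert` lines are the useful part: they turn a silent CPU/GPU layout drift into a compile error.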
Bugs Fixed Mid-Epic
- Frame Debugger preview refresh required cascade-click round-trip. m_DepthPreviewKey (the Phase 14F depth-blit cache key) was never reset when a new capture began — same-archive re-selections after recapture skipped the blit and served stale texture. Matched m_PerDrawPreviewKey's invalidation at BeginCapture time.
- Archive sink format-reinterpretation. FrameDebugger::ToVkFormat is a parallel copy of RenderGraph::ToVkFormat and was missing cases for the new GTAO formats, so vkCmdCopyImage between the source image and the RGBA8 fallback destination did a raw byte reinterpretation — visible as colored horizontal banding over both GTAOLinearDepth (R32_SFLOAT) and GTAORawAO (R8_UNORM) previews. Fixed by adding the missing cases to both maps + to RG::TextureFormat.
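The failure mode in the second bug (a format map that silently falls back) can be avoided by making the unmapped case explicit. The sketch below is an illustration, not the engine's code: both enums are hypothetical stand-ins (no Vulkan headers are used), and `ToNativeFormat` is an assumed name.

```cpp
#include <optional>

// Hypothetical engine-side format enum.
enum class TextureFormat { RGBA8_Unorm, RGBA16_Float, D32_Float, R8_Unorm, R32_Float };

// Hypothetical stand-in for the backend format enum (VkFormat in the engine).
enum class NativeFormat { R8G8B8A8_UNORM, R16G16B16A16_SFLOAT, D32_SFLOAT, R8_UNORM, R32_SFLOAT };

// Single mapping with no `default:` label, so compilers that warn on
// unhandled enumerators flag a missing case when TextureFormat grows.
// An unmapped value yields nullopt instead of an RGBA8 fallback.
std::optional<NativeFormat> ToNativeFormat(TextureFormat format) {
    switch (format) {
        case TextureFormat::RGBA8_Unorm:  return NativeFormat::R8G8B8A8_UNORM;
        case TextureFormat::RGBA16_Float: return NativeFormat::R16G16B16A16_SFLOAT;
        case TextureFormat::D32_Float:    return NativeFormat::D32_SFLOAT;
        case TextureFormat::R8_Unorm:     return NativeFormat::R8_UNORM;
        case TextureFormat::R32_Float:    return NativeFormat::R32_SFLOAT;
    }
    return std::nullopt;  // out-of-range value: caller must handle it
}
```

Sharing one such function between the render graph and the debugger would also remove the parallel copy that let the two maps drift apart.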
Files Added / Modified
New:
- luth/assets/shaders/depthPrepass.vert + depthPrepass_skinned.vert — position-only Z-prepass vertex shaders
- luth/assets/shaders/gtao_depth_prefilter.comp — half-res min-gather + linearize
- luth/assets/shaders/gtao_main.comp — horizon-based AO integral
- luth/assets/shaders/gtao_denoise.comp — 3×3 bilateral-depth denoise
- luth/source/luth/renderer/GTAOSettings.h — GTAOSettings + GTAOUBO (std140)
- luth/source/luth/renderer/passes/DepthPrepass.cpp — camera-space Z-prepass (AddDepthPrepass)
- luth/source/luth/renderer/passes/AOPass.cpp — AddGTAODepthPrefilterPass / AddGTAOMainPass / AddGTAODenoisePass
- docs/development/epics/gtao.md (deleted at epic close — this file supersedes it)
Modified:
- luth/assets/shaders/pbr.frag — Set 0 bindings 4/5; GTAO modulation in ambient term; viz early-out
- luth/source/luth/scene/systems/RenderingSystem.{h,cpp} — Set 0 layout grows to 6 bindings; InitAOResources, UpdateAODescriptors, UpdateGTAOUBO; GTAO descriptor pool + sampler; frame-graph wiring (DepthPrepass → GTAO×3 → GeometryPass); tracked RTs; hot-reload rebuild of all three GTAO pipelines; m_DepthPreviewKey invalidation at capture start
- luth/source/luth/renderer/passes/GeometryPass.cpp — receive SceneDepth handle, LOAD_OP_LOAD + LESS_EQUAL
- luth/source/luth/renderer/{Texture.h,backend/vulkan/VulkanTexture.cpp} — R32_Float format
- luth/source/luth/renderer/rendergraph/{RenderGraphResources.h,RenderGraph.cpp,RenderResourceCache.cpp} — R32_Float + R8_Unorm formats
- luth/source/luth/renderer/FrameDebugger.cpp — archive format map gains R32_Float + R8_Unorm
- luth/source/luth/renderer/PostProcessSettings.h — nested GTAOSettings gtao;
- luth/source/luth/editor/panels/RenderPanel.cpp — "Ambient Occlusion (GTAO)" collapsing section
- luth/source/luth/core/Version.h — bumped to v1.5.0
Out of Scope (Future Polish)
- XeGTAO parity. Full 5-mip LDS depth pyramid; edges texture for anisotropic denoising; multi-bounce approximation for diffuse; selective specular attenuation.
- Temporal accumulation. Reuses GTAO for free once the fxaa-taa epic (#72) lands motion vectors + history buffer.
- Half-res / full-res toggle. UI field exists but has no effect yet — always half-res. Trivial to wire once a use case demands it.
- PostProcessSettings serialization. GTAO settings reset to XeGTAO defaults per session, matching the existing bloom/tonemap fields. If editor persistence is wanted, extend EditorSettings to mirror the fields.
- AO-aware specular. Currently multiplies diffuseIBL + specularIBL equally; XeGTAO weights specular with a separate cone-trace term derived from horizons.
v1.4.0 — Frame Debugger Sync Rework
Phase 14 — Frame Debugger Sync Rework
Version: v1.4.0 | Date: 2026-04-17 | Epic: #74 | Supersedes: #31
What Was Built
Reworked the Frame Debugger into a Unity-grade, GPU-true debugging tool. The old live-replay model — re-executing the pipeline up to N draws using the current uniforms/cull state, not the captured ones — was the root of a chronic sync bug where the displayed image never matched the selected step. Phase 14 deletes that path entirely and replaces it with archived per-pass images + on-demand per-draw replay.
- Archive sink (IArchiveSink + ArchivedImage) — RG::RenderGraph::Execute invokes a sink hook after every non-culled pass; the FrameDebugger sink emits vkCmdCopyImage for each tracked render target into a fresh, persistent staging image, restoring the source layout so the RG's compile-time barrier solver stays consistent. Tracked RTs in v1: SceneColor, SceneDepth, ShadowMap.C0..C3 (one per cascade — ShadowPass imports per-layer views with names suffixed by cascade index), LDROutput, EntityID, BloomAFinal (~10 archives, ~50–100 MB at 1080p).
- Frozen-state model — strict snapshot. The Frozen branch in RenderingSystem::Update does NOT rebuild or re-execute the live graph; m_LDROutput retains its captured contents and the editor's ScenePanel keeps showing the GPU-true frame. Each Frozen tick bit-compares the current viewProj against captureViewProj — a mismatch flips the state machine back to CaptureRequested and the next frame runs a fresh capture (Unity behavior: frozen on the captured image, auto-refresh on camera move).
- Hierarchical EventNode tree — replaces the flat pass→draw list with Group / Pass / Cascade / Draw kinds. An explicit prefix registry (FrustumCull. → "Frustum Culling", ShadowPass.C<N> → "Shadows" with cascade children) keeps grouping deterministic. Built once at FinalizeCapture and stored on CapturedFrame::rootEvent.
- Per-draw replay-then-copy — clicking a GeometryPass draw triggers an ImmediateSubmit that re-records the pass up to draw N into m_SceneColor and copies the result into a persistent RGBA16F preview the panel samples through ImGui. The live UBOs/SSBOs/indirect buffer are byte-stable in Frozen state (no live writers between captures once AnimationSystem is paused — see bug fix below), so no separate frozen-buffer plumbing is required. Cache is keyed by (passIdx, localDrawIdx) and invalidated on every BeginCapture / ExitCapture.
- CSM cascade UI — Cascade nodes in the tree map to per-cascade single-layer depth archives. BlitArchivedDepthToPreview linearizes the selected cascade through the existing depth blit pipeline into an RGBA8 preview, using [prev_split..this_split] for sensible per-cascade contrast. Detail panel surfaces capture-time cascadeSplitsViewZ, shadowBias, shadowNormalBias, cascadeTexelSize, and the full lightSpaceMatrix[i] — values stamped from m_Cached* at finalize so editing light parameters while frozen doesn't desync the readout.
- Lifetime safety — archive teardown deferred via VulkanContext::PushDeletion so in-flight ImGui frames sampling archive views can complete before the views/images are freed. Panel descriptor caches keyed by VkImageView pointer (not archive index) so recaptures with overlapping indices always trigger fresh ImGui_ImplVulkan_AddTexture calls. All archive frees route through VulkanAllocator::FreeImage to keep the editor's MemoryTracker GPU counter in sync with VMA.
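The Frozen-state rule (bit-compare the view-projection matrix, recapture on mismatch) reduces to a small state machine. The sketch below is illustrative: `FrameDebuggerSketch`, its members, and the `Live` state are hypothetical, and a `std::array<float, 16>` stands in for the engine's matrix type.

```cpp
#include <array>
#include <cstring>

using Mat4 = std::array<float, 16>;  // stand-in for the engine's matrix type

enum class DebuggerState { Live, CaptureRequested, Frozen };

struct FrameDebuggerSketch {
    DebuggerState state = DebuggerState::Live;
    Mat4 captureViewProj{};

    // Called when a capture completes: remember the camera and freeze.
    void FinalizeCapture(const Mat4& viewProj) {
        captureViewProj = viewProj;
        state = DebuggerState::Frozen;
    }

    // Per-frame tick. Returns true when the live graph should execute.
    // While Frozen the graph never runs; a camera change only schedules a
    // fresh capture for the next frame.
    bool Tick(const Mat4& currentViewProj) {
        if (state == DebuggerState::Frozen) {
            const bool moved = std::memcmp(currentViewProj.data(),
                                           captureViewProj.data(),
                                           sizeof(float) * 16) != 0;
            if (moved) state = DebuggerState::CaptureRequested;
            return false;
        }
        return true;
    }
};
```

A bitwise comparison is deliberately strict: any camera change, however small, invalidates the frozen image, and there is no epsilon to tune.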
Bugs Fixed Mid-Phase
- Cache key collision across captures — same (passIdx, drawIdx) after a recapture meant the second click on the same draw was a cache hit and the panel served stale preview content. Fixed by invalidating m_PerDrawPreviewKey on every BeginCapture and ExitCapture.
- MemoryTracker drift — deferred destroy lambda originally called vmaDestroyImage directly, bypassing VulkanAllocator::FreeImage (which is what fires MemoryTracker::RecordFree). VMA freed the GPU memory but the editor's GPU counter only ever saw allocations. Fixed by routing all archive + per-draw + depth-preview destruction through VulkanAllocator::FreeImage.
- Click handler swallowed by right-aligned annotation — ImGui::IsItemClicked was called after the SameLine + TextDisabled annotation, so it queried the disabled label (unclickable) instead of the tree node. Fixed by capturing clickedThisNode immediately after TreeNodeEx.
- Cascade output "no preview" — initial tracked-RT set registered "ShadowMap" but ShadowPass imports per-cascade resources named "ShadowMap.C<i>", so the sink filtered every cascade write out and cascade nodes had archivedImageIndex = -1. Fixed by registering all four ShadowMap.C0..C3 names. BlitArchivedDepthToPreview also needed a fallback to archive.view when archive.layers <= 1 because each cascade archive is single-layer (the source is a per-layer view onto the shared 4-layer image), but the EventNode's archiveLayer carries the cascade index 0..3 for detail-panel lookups.
- Animated meshes drifting between draw replays — AnimationSystem::Update ticked every frame regardless of debugger state, so BoneMatrixBuffer's contents changed between consecutive per-draw replays and each draw rendered a different pose. Fixed by early-returning from AnimationSystem::Update when RenderingSystem::GetDebuggerState() == Frozen. Mirrors Unity's pause-while-inspecting behavior; will fold into a scene-level pause flag once Phase 16 (physics) and Phase 15 (play mode) land.
- Timings vanish when scene gains its first model — m_GPUTimers.Init(16) was below the live frame's non-culled pass count. With no models, ShadowPass.C0..C3 are dead-pass-culled, total ≤16. Adding one model un-culls them, total >16, and GPUTimerPool::ReadResults early-returns -1 for every slot when passCount > maxPasses. Bumped capacity to 64 (current frame ≈19 passes; headroom for GTAO etc.).
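The first bug in the list (a stale cache hit after recapture) comes down to a key that outlives the data it describes. A minimal sketch of the keyed cache with explicit invalidation follows; `PerDrawPreviewCache` and its methods are hypothetical names, and the actual replay work is reduced to a counter.

```cpp
#include <cstdint>
#include <optional>

struct PreviewKey {
    std::uint32_t passIdx;
    std::uint32_t localDrawIdx;
    bool operator==(const PreviewKey& other) const {
        return passIdx == other.passIdx && localDrawIdx == other.localDrawIdx;
    }
};

class PerDrawPreviewCache {
public:
    // Returns true when a replay was needed (cache miss), false on a hit.
    bool Select(std::uint32_t passIdx, std::uint32_t localDrawIdx) {
        const PreviewKey key{passIdx, localDrawIdx};
        if (m_Key && *m_Key == key) return false;  // preview already current
        m_Key = key;
        ++m_ReplayCount;  // stand-in for the replay-then-copy work
        return true;
    }

    // Must run whenever the captured frame changes (capture begin and exit);
    // otherwise the same indices from a new capture look like a hit.
    void Invalidate() { m_Key.reset(); }

    int ReplayCount() const { return m_ReplayCount; }

private:
    std::optional<PreviewKey> m_Key;
    int m_ReplayCount = 0;
};
```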
Files Modified / Added
New:
- luth/source/luth/renderer/rendergraph/ArchivedImage.{h,cpp} — staging-image RAII + lazy per-layer view cache
- luth/source/luth/renderer/rendergraph/IArchiveSink.h — RG post-pass hook interface
- luth/source/luth/renderer/rendergraph/FrameEventTree.{h,cpp} — hierarchical event model
Modified:
- luth/source/luth/renderer/rendergraph/RenderGraph.{h,cpp} — SetArchiveSink + post-pass invocation
- luth/source/luth/renderer/rendergraph/FrameCapture.h — archivedImages, passArchives, captureViewProj, rootEvent, cascade cache (splits/bias/texel/light-space matrices)
- luth/source/luth/renderer/FrameDebugger.{h,cpp} — IArchiveSink impl, archive lifecycle, deferred teardown
- luth/source/luth/scene/systems/RenderingSystem.{h,cpp} — gut RenderCapturedFrame (~350 LoC), rewrite Frozen branch, FinalizeCapture, ReplayPassUpToDraw, BlitArchivedDepthToPreview, EnsurePerDrawPreviewTexture, EnsureDepthPreviewTexture, GPU timer pool bumped to 64
- luth/source/luth/scene/systems/AnimationSystem.cpp — early-return when Frame Debugger is Frozen
- luth/source/luth/editor/panels/FrameDebuggerPanel.{h,cpp} — recursive DrawEventNode, archive / per-draw / depth preview paths, cascade detail block
- luth/source/luth/core/Version.h — bumped to v1.4.0
Out of Scope (Future Polish)
- Per-draw replay for non-GeometryPass passes (ShadowPass.C<i>, fullscreen passes). Today the panel falls back to the pass-output archive for those.
- 3D viewport overlay of cascade frustum slices.
- 4-thumbnail strip for A/B cascade comparison (cascade detail panel currently shows one cascade at a time).
- HDR tonemapping for the per-draw preview (raw RGBA16F surfaces, clipping above 1.0 is annotated in the panel).
- Scene-level pause flag — replace the AnimationSystem ↔ RenderingSystem direct query with a single Scene::IsPaused() flag once PhysicsSystem / PlayMode land (Phases 15–16).
v1.3.0 — Cascaded Shadow Maps
Phase 13 — Cascaded Shadow Maps
Version: v1.3.0 | Date: 2026-04-16 | Epic: #60
What Was Built
Replaced the single 2048² directional-light shadow map with a 4-cascade PSSM system:
- 4-layer shadow array — VK_IMAGE_VIEW_TYPE_2D_ARRAY D32 texture (2048×2048×4); per-layer VkImageView for ShadowPass writes; full-array view for PBR sampling.
- CSM uniform plumbing — GlobalUniforms extended with lightSpaceMatrix[4], cascadeSplitsViewZ, shadowBias, shadowNormalBias, cascadeTexelSize, cascadeBlendWidth, debugVisualizeCascades (std140, 544 B).
- PSSM splits — Engel practical formula (splitLambda = 0.5 default) converted to view-space Z.
- Per-cascade ortho fitting — Sascha Willems bounding-sphere approach: centroid of 8 sub-frustum corners, radius = ceil(r * 16) / 16, glm::lookAt + symmetric glm::ortho(-r, r, -r, r, 0, 2r). Rotation-invariant and shimmer-resistant.
- ShadowPass multi-layer — 4 × AddShadowPass calls, each with per-layer view and cascadeIndex push constant.
- Per-cascade GPU culling — Indirect buffer region per cascade; 5 cull dispatches (camera + 4 cascades); frustum planes extracted via Gribb-Hartmann from each lightSpaceMatrix[i].
- PBR cascade selection — viewZ → primary cascade; inside-test fall-through loop for robustness; 3×3 PCF via sampler2DArrayShadow; cascade blend at transition zone; per-cascade depth + normal bias scaled by cascadeTexelSize.
- Debug viz — DebugVisualizeCascades flag tints fragments by cascade index (red/green/blue/yellow).
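The split computation and the radius snapping mentioned in the list can be sketched as below. This is a minimal version of the practical split scheme (a lambda-weighted blend of logarithmic and uniform splits), assuming positive view distances as output. The conversion to the engine's view-space Z sign convention is left out, and the function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

constexpr std::size_t kCascadeCount = 4;

// Far distance of each cascade, as a positive distance from the camera.
// lambda = 1 gives purely logarithmic splits, lambda = 0 purely uniform.
std::array<float, kCascadeCount> ComputeCascadeSplits(float nearZ, float farZ,
                                                      float lambda) {
    assert(nearZ > 0.0f && farZ > nearZ);
    assert(lambda >= 0.0f && lambda <= 1.0f);
    std::array<float, kCascadeCount> splits{};
    for (std::size_t i = 0; i < kCascadeCount; ++i) {
        const float p = static_cast<float>(i + 1) / static_cast<float>(kCascadeCount);
        const float logSplit = nearZ * std::pow(farZ / nearZ, p);
        const float uniformSplit = nearZ + (farZ - nearZ) * p;
        splits[i] = lambda * logSplit + (1.0f - lambda) * uniformSplit;
    }
    return splits;
}

// Round the bounding-sphere radius up to a multiple of 1/16 so the ortho
// extent changes in discrete steps rather than every frame.
float SnapRadius(float r) {
    return std::ceil(r * 16.0f) / 16.0f;
}
```

With lambda = 0.5 the near cascades stay tight (good resolution close to the camera) while the far cascades do not grow as aggressively as a purely logarithmic scheme would make them.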
Known Issue — Coverage Gaps
A light-direction-dependent coverage bug persists: large ground-plane regions fail the ProjectInCascade inside-test (specifically proj.z < 0), appearing unlit. The bug is not in the cascade-fit math — Sascha's verbatim reference implementation also produces the symptom. Cascade-tint visualization confirms the fit geometry is correct; the failure is in shadow-map sampling.
Prioritized suspects for next session:
- UBO round-trip integrity — verify shader receives the correct lightSpaceMatrix values (sentinel matrix test).
- Shadow pass cullMode = VK_CULL_MODE_FRONT_BIT — Sascha's reference uses back-face cull; mismatch may corrupt depth writes.
- Gribb-Hartmann near plane uses GL clip convention (row3 + row2) rather than Vulkan (row2 alone); this only makes culling too permissive, but it is worth fixing.
Recommended tooling: Frame Debugger needs to be updated before tackling this (it was unavailable during Phase 13 debugging and would have resolved the issue quickly).
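The difference between the two near-plane conventions in the third suspect can be made concrete. The sketch assumes 0-based rows of a matrix M with clip = M · v, so row 2 produces clip-space z and row 3 produces clip-space w. A column-major library such as glm stores the transpose, so rows would need to be gathered accordingly. The type and function names are hypothetical.

```cpp
#include <array>

using Vec4 = std::array<float, 4>;
using Mat4Rows = std::array<Vec4, 4>;  // m[i] is row i of the clip-from-world matrix

enum class ClipDepth {
    ZeroToOne,    // Vulkan / D3D: visible when 0 <= z_clip <= w_clip
    NegOneToOne   // OpenGL default: visible when -w_clip <= z_clip <= w_clip
};

// Gribb-Hartmann near plane (unnormalized). A point is on the visible side
// when PlaneDistance(...) >= 0.
Vec4 ExtractNearPlane(const Mat4Rows& m, ClipDepth convention) {
    if (convention == ClipDepth::ZeroToOne) {
        return m[2];  // 0 <= z_clip
    }
    return Vec4{m[3][0] + m[2][0], m[3][1] + m[2][1],
                m[3][2] + m[2][2], m[3][3] + m[2][3]};  // 0 <= w_clip + z_clip
}

float PlaneDistance(const Vec4& plane, float x, float y, float z) {
    return plane[0] * x + plane[1] * y + plane[2] * z + plane[3];
}
```

Under a [0, 1] depth range the GL-style plane sits behind the true near plane, so it accepts objects that the correct plane would reject. That is wasteful for culling but cannot by itself remove visible geometry.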
Files Modified
- luth/source/luth/scene/systems/RenderingSystem.{h,cpp} — GlobalUniforms, shadow resources, cascade math, per-cascade cull dispatches
- luth/source/luth/renderer/passes/ShadowPass.cpp — per-layer view, cascadeIndex push constant
- luth/source/luth/renderer/passes/CullPass.{h,cpp} — destOffset arg, 5 named dispatches
- luth/source/luth/renderer/passes/GeometryPass.cpp — all-layer barrier before geometry
- luth/source/luth/renderer/backend/vulkan/VulkanTexture.{h,cpp} — CreateLayerView
- luth/source/luth/scene/Components.h — DirectionalLight CSM fields
- luth/source/luth/scene/SceneSerializer.cpp — persist CSM fields
- luth/source/luth/editor/panels/InspectorPanel.cpp — basic CSM inspector controls
- luth/assets/shaders/pbr.frag — cascade selection, blending, PCF, debug tint
- luth/assets/shaders/shadowDepth.vert — cascadeIndex push constant
- luth/assets/shaders/shadowDepth_skinned.vert — same
- luth/assets/shaders/gpu_cull.comp — destOffset push constant
- luth/source/luth/core/Version.h — bumped to v1.3.0
v1.2.0 — Compute + GPU Culling
Highlights
GPU-driven rendering infrastructure: compute pass support in the render graph, GPU frustum culling via compute shader + indirect draw, and migration of every graphics pass off vkCmdDrawIndexed to vkCmdDrawIndexedIndirect. This is the keystone phase that unlocks GTAO, Forward+, HZB occlusion, and GPU particles in later phases. (Epic #55)
What's Included
- Render graph compute + buffer infrastructure — BufferDesc/BufferHandle/BufferBarrier, 5 new ResourceState values (ComputeRead/Write, StorageBufferRead/Write, IndirectRead), AddComputePass<Data>(), VkBufferMemoryBarrier2 emission, VMA-backed pooled storage buffers.
- VKComputePipeline wrapper — clean abstraction over vkCreateComputePipelines with pipeline layout + push constants, used for both GPU cull and IBL precompute.
- GPU frustum cull compute pass — gpu_cull.comp (256 invocations/group) tests bounding spheres against 6 frustum planes; sets instanceCount=0 on culled indirect commands (no GPU-side compaction — hardware skips zero-count draws).
- GPUObjectData SSBO (Set 5) — 112-byte std430 struct per object. All vertex shaders read via objects[gl_BaseInstance]; push constants removed from geometry, shadow, PBR, and skinned variants.
- Indirect draw conversion — GeometryPass and ShadowPass both use vkCmdDrawIndexedIndirect, grouped by (VB, IB, pipeline). Shadow reuses the main-camera cull results (per-cascade culling deferred to Phase 13).
- IBLPrecompute refactor — ad-hoc vkCreateComputePipelines replaced with persistent VKComputePipeline instances.
- Frame Debugger extensions — DispatchKind enum, CaptureIndirectDraw(), CaptureComputeDispatch(), [C] / [I] prefixes in the panel tree, compute/indirect metadata in detail view.
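A CPU reference for the cull pass helps clarify what the compute shader decides per object. The sketch below is not gpu_cull.comp. It assumes normalized planes with inward-facing normals and one sphere per indirect command, and all names except the field order of the indirect command (which mirrors VkDrawIndexedIndirectCommand) are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Normalized plane: a point p is inside when dot(n, p) + d >= 0.
struct Plane { float nx, ny, nz, d; };

struct BoundingSphere { float x, y, z, radius; };

// Same field order as VkDrawIndexedIndirectCommand.
struct DrawIndexedIndirectCommand {
    std::uint32_t indexCount;
    std::uint32_t instanceCount;
    std::uint32_t firstIndex;
    std::int32_t vertexOffset;
    std::uint32_t firstInstance;
};

// A sphere is rejected as soon as it lies fully behind any one plane.
bool SphereVisible(const std::array<Plane, 6>& frustum, const BoundingSphere& s) {
    for (const Plane& p : frustum) {
        const float distance = p.nx * s.x + p.ny * s.y + p.nz * s.z + p.d;
        if (distance < -s.radius) return false;
    }
    return true;
}

// No compaction: culled draws stay in the buffer with instanceCount = 0,
// which the GPU treats as a no-op draw.
void CullDraws(const std::array<Plane, 6>& frustum,
               const std::vector<BoundingSphere>& spheres,
               std::vector<DrawIndexedIndirectCommand>& commands) {
    assert(spheres.size() == commands.size());
    for (std::size_t i = 0; i < commands.size(); ++i) {
        commands[i].instanceCount = SphereVisible(frustum, spheres[i]) ? 1u : 0u;
    }
}
```

Skipping compaction keeps the command buffer layout stable from frame to frame, at the cost of the GPU still walking over zero-count entries.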
Descriptor Sets (final after Phase 12)
| Set | Content |
|---|---|
| 0 | GlobalUniforms + shadowMap + IBL |
| 1 | Bindless textures |
| 2 | Material SSBO |
| 3 | Light UBO |
| 4 | Bone matrices SSBO |
| 5 | GPUObjectData SSBO (new) |