Releases: ciscoriordan/kindling

v0.14.0 — OEB 1.x OPF support

20 Apr 01:29

v0.14.0 — OEB 1.x OPF support (drop-in kindlegen compatibility for PyGlossary)

Accepts OEB 1.x OPF dictionaries that kindlegen understood but earlier kindling releases rejected. Motivated by #3 (PyGlossary drop-in replacement).

What's new

  • Body-form <idx:orth> headwords: <idx:orth><b>headword</b></idx:orth> now parses alongside the existing <idx:orth value="headword"/> attribute form. Both are permitted by Amazon KPG §15.6, and kindlegen accepted both. Previously, body-form dictionaries failed --no-validate builds with "No dictionary entries found".
  • Capitalized Dublin Core elements: <dc:Title>, <dc:Identifier id="uid">, <dc:Creator>, etc. inside OEB 1.x <dc-metadata> / <x-metadata> wrappers are now matched case-insensitively. The package unique-identifier now binds correctly to <dc:Identifier> (fixes an R16.1 false positive).
  • <EmbeddedCover> as Method 3 — OEB 1.x <x-metadata><EmbeddedCover>cover.png</EmbeddedCover> now counts as a cover declaration alongside Method 1 (properties="coverimage") and Method 2 (<meta name="cover">).
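The two headword forms described above can be handled by a single lookup that prefers the value attribute and falls back to the element body. The following is a minimal sketch using plain string scanning; headword and strip_tags are hypothetical helpers written for illustration and are not kindling's actual extractor, which may well use a real XML parser.

```rust
/// Drop anything that looks like a tag and keep only the text content.
fn strip_tags(s: &str) -> String {
    let mut out = String::new();
    let mut in_tag = false;
    for c in s.chars() {
        match c {
            '<' => in_tag = true,
            '>' => in_tag = false,
            _ if !in_tag => out.push(c),
            _ => {}
        }
    }
    out.trim().to_string()
}

/// Return the headword of the first <idx:orth> element in `entry`.
/// Hypothetical helper: assumes double-quoted attributes and no '>' inside them.
pub fn headword(entry: &str) -> Option<String> {
    let start = entry.find("<idx:orth")?;
    let rest = &entry[start..];
    let tag_end = rest.find('>')?;
    let open_tag = &rest[..=tag_end];

    // Attribute form: <idx:orth value="headword"/>
    if let Some(pos) = open_tag.find("value=\"") {
        let after = &open_tag[pos + "value=\"".len()..];
        let end = after.find('"')?;
        let value = after[..end].trim();
        if !value.is_empty() {
            return Some(value.to_string());
        }
    }

    // A self-closing tag has no body to fall back on.
    if open_tag.ends_with("/>") {
        return None;
    }

    // Body form: <idx:orth><b>headword</b></idx:orth>
    let body_start = tag_end + 1;
    let body_end = rest[body_start..].find("</idx:orth>")? + body_start;
    let text = strip_tags(&rest[body_start..body_end]);
    if text.is_empty() {
        None
    } else {
        Some(text)
    }
}
```

An entry with neither a usable value attribute nor body text yields None, which is the case a "No dictionary entries found" style error would be built on.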

Validator

  • R15.6 no longer flags body-form <idx:orth> with non-empty body text as empty.
  • R16.1 / R16.5 / R16.8 now match <dc:identifier> / <dc:date> / <dc:language> case-insensitively, so they recognise OEB 1.x capitalized DC elements.
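As an illustration of the case-insensitive matching these rules rely on, here is a small sketch of binding the package unique-identifier to a Dublin Core identifier element regardless of capitalization. DcElement, is_dc_element, and bound_identifier are hypothetical names, and the choice to compare the dc prefix exactly while folding only the local name is an assumption of this sketch.

```rust
/// Simplified view of one metadata element (hypothetical type).
pub struct DcElement<'a> {
    pub tag: &'a str,        // qualified name as written, e.g. "dc:Identifier"
    pub id: Option<&'a str>, // value of the id attribute, if any
    pub text: &'a str,       // element text content
}

/// True when `tag` names the Dublin Core element `local`, ignoring ASCII case
/// in the local name. Assumption: the "dc" prefix itself is compared exactly.
pub fn is_dc_element(tag: &str, local: &str) -> bool {
    match tag.split_once(':') {
        Some((prefix, name)) => prefix == "dc" && name.eq_ignore_ascii_case(local),
        None => false,
    }
}

/// Find the identifier whose id matches the package unique-identifier value.
pub fn bound_identifier<'a>(
    elements: &[DcElement<'a>],
    unique_identifier: &str,
) -> Option<&'a str> {
    elements
        .iter()
        .find(|e| is_dc_element(e.tag, "identifier") && e.id == Some(unique_identifier))
        .map(|e| e.text)
}
```

With this shape, <dc:Identifier id="uid"> and <dc:identifier id="uid"> resolve to the same binding, so a rule like R16.1 would see an identifier in both cases.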

Other

  • New crate-wide pub const DEFAULT_AUTHOR = "kindling". EXTH 100 is now always emitted, falling back to DEFAULT_AUTHOR when the OPF provides no <dc:Creator> (or provides an empty one), so the post-build readback no longer fails on PyGlossary-style input. The comic pipeline was using a private DEFAULT_AUTHOR constant that now points at the shared one.
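The fallback described above amounts to a single match on the optional creator. This is a sketch only: exth_author is a hypothetical function name, the &str type on the constant is assumed, and treating a whitespace-only creator as empty is also an assumption.

```rust
/// Shared default author. The value comes from the release notes; the type is assumed.
pub const DEFAULT_AUTHOR: &str = "kindling";

/// Choose the author string to emit as EXTH 100 (hypothetical helper).
/// A missing, empty, or whitespace-only creator falls back to DEFAULT_AUTHOR.
pub fn exth_author(creator: Option<&str>) -> &str {
    match creator.map(str::trim) {
        Some(name) if !name.is_empty() => name,
        _ => DEFAULT_AUTHOR,
    }
}
```

Because the function always returns a non-empty string, a record written from it can always be read back, which is the property the post-build readback depends on.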

Tests

  • New tests/fixtures/pyglossary_oeb1x_dict/ fixture (OEB 1.x OPF + body-form <idx:orth> + <EmbeddedCover> + inflections), exercised end-to-end by validate and build CLI integration tests.
  • 12 new unit tests covering the body-form <idx:orth> extractor, OEB 1.x OPF parsing, EmbeddedCover recognition, R15.6 body acceptance, and R16.1 case-insensitive <dc:Identifier> binding.
  • Full suite: 729/729 passing.

Full Changelog: v0.13.2...v0.14.0

v0.13.2 - Phase 0.1 follow-ups

16 Apr 02:47

Phase 0.1 follow-ups

Shared ExtractedEpub across builder and validator

The kindling build flow now parses the OPF exactly once per invocation. mobi::build_mobi_from_extracted(&ExtractedEpub, ...) is the new primary entry point; the existing build_mobi(opf_path, ...) is kept as a thin wrapper so all non-CLI callers (tests, comic pipeline, mobi_check) keep working without changes. Pre-flight validation and the MOBI builder now share one ExtractedEpub, eliminating a full OPF parse pass per build.

Kindlegen parity tests confirm byte-identical MOBI/AZW3 output. Zero behavior change for all existing fixtures.
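The parse-once arrangement can be pictured as follows. This is a heavily simplified sketch: the real functions take additional arguments that the notes elide, the ExtractedEpub here is a stand-in holding only a path, and from_opf, validate, and run_build are hypothetical names used for illustration.

```rust
use std::path::{Path, PathBuf};

/// Stand-in for the parsed package; the real struct holds far more.
pub struct ExtractedEpub {
    pub opf_path: PathBuf,
}

impl ExtractedEpub {
    /// Hypothetical constructor. A real one would read and parse the OPF here.
    pub fn from_opf(opf_path: &Path) -> Self {
        ExtractedEpub { opf_path: opf_path.to_path_buf() }
    }
}

/// Pre-flight validation borrows the shared parse result.
pub fn validate(epub: &ExtractedEpub) -> Vec<String> {
    let mut findings = Vec::new();
    if epub.opf_path.as_os_str().is_empty() {
        findings.push("empty OPF path".to_string());
    }
    findings
}

/// Primary entry point: builds from an already-parsed package.
/// The returned bytes are a placeholder, not a real MOBI file.
pub fn build_mobi_from_extracted(epub: &ExtractedEpub) -> Result<Vec<u8>, String> {
    if epub.opf_path.as_os_str().is_empty() {
        return Err("nothing to build".to_string());
    }
    Ok(b"placeholder".to_vec())
}

/// Thin wrapper kept for callers that only have a path.
pub fn build_mobi(opf_path: &Path) -> Result<Vec<u8>, String> {
    let epub = ExtractedEpub::from_opf(opf_path);
    build_mobi_from_extracted(&epub)
}

/// CLI-style flow: one parse, shared by validation and the builder.
pub fn run_build(opf_path: &Path) -> Result<Vec<u8>, String> {
    let epub = ExtractedEpub::from_opf(opf_path); // the only parse
    let findings = validate(&epub);
    if !findings.is_empty() {
        return Err(findings.join("; "));
    }
    build_mobi_from_extracted(&epub)
}
```

Keeping the path-based wrapper means existing callers compile unchanged, while the CLI path avoids the second parse by passing the same borrow to both stages.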

CSS check consumes parsed lightningcss cache

src/checks/css_forbidden.rs now consumes ExtractedEpub::css_summary() (117 lines) instead of its own text-scanning helpers (754 lines), a net reduction of 637 lines. All 7 rules (R6.13 through R6.e2) fire with byte-identical output on the error fixture. The parsed stylesheet cache added in 0.13.1 now has its first consumer.

Stats

  • 710 tests pass (665 lib + 32 cli + 7 parity + 6 roundtrip), 2 ignored
  • Zero cargo warnings

Full Changelog: v0.13.1...v0.13.2

v0.13.1 - Validator quality and infrastructure

16 Apr 02:21

Validator quality and infrastructure

Profile-mask enforcement

validate() now post-filters findings through Rule::applies_to(profile), so the profile_mask field on every rule is actually enforced. Previously, rules declared a profile mask but all checks ran unconditionally.

R6.14 (forbidden position: fixed/absolute/sticky) is now scoped to Default|Dict instead of ALL_PROFILES, because fixed-layout content legitimately uses absolute positioning. This eliminates 49 false-positive hits on the w3c/epub-tests corpus.
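A profile mask of this kind is commonly a small bitfield, with the post-filter being a single retain pass over the findings. The sketch below assumes that representation: the bit layout, the Finding struct, and post_filter are hypothetical, and only the names Profile, Rule, applies_to, profile_mask, and ALL_PROFILES come from these notes.

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum Profile {
    Default,
    Comic,
    Dict,
    Textbook,
}

impl Profile {
    /// One bit per profile (the layout is an assumption of this sketch).
    pub const fn bit(self) -> u8 {
        match self {
            Profile::Default => 1 << 0,
            Profile::Comic => 1 << 1,
            Profile::Dict => 1 << 2,
            Profile::Textbook => 1 << 3,
        }
    }
}

pub const ALL_PROFILES: u8 = 0b1111;

pub struct Rule {
    pub id: &'static str,
    pub profile_mask: u8,
}

impl Rule {
    /// A rule applies when its mask contains the active profile's bit.
    pub fn applies_to(&self, profile: Profile) -> bool {
        (self.profile_mask & profile.bit()) != 0
    }
}

/// R6.14 scoped to Default|Dict, as described above.
pub static R6_14: Rule = Rule {
    id: "R6.14",
    profile_mask: Profile::Default.bit() | Profile::Dict.bit(),
};

/// Hypothetical finding type: a message tied to the rule that produced it.
pub struct Finding {
    pub rule: &'static Rule,
    pub message: String,
}

/// Drop findings whose rule does not apply to the active profile.
pub fn post_filter(findings: Vec<Finding>, profile: Profile) -> Vec<Finding> {
    findings
        .into_iter()
        .filter(|f| f.rule.applies_to(profile))
        .collect()
}
```

Filtering after the checks run keeps each check free of profile logic, at the cost of doing some work whose results are then discarded.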

Fixed-layout itemref override

R11.2 and R11.7 now honor per-spine-item properties="rendition:layout-reflowable" overrides. Pages explicitly marked reflowable are no longer checked for viewport metas. Fixes 3 corpus false positives.
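The override logic can be sketched as resolving an effective layout per spine item before deciding whether a viewport check applies. Layout, effective_layout, and needs_viewport_check are hypothetical names; the rendition:layout-pre-paginated token is the EPUB 3 counterpart of the reflowable override and is included here for symmetry rather than because the notes mention it.

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum Layout {
    Reflowable,
    PrePaginated,
}

/// Resolve the layout of one spine item: a per-itemref override in the
/// space-separated `properties` attribute wins over the package-level default.
pub fn effective_layout(package_layout: Layout, itemref_properties: &str) -> Layout {
    for prop in itemref_properties.split_whitespace() {
        match prop {
            "rendition:layout-reflowable" => return Layout::Reflowable,
            "rendition:layout-pre-paginated" => return Layout::PrePaginated,
            _ => {}
        }
    }
    package_layout
}

/// Viewport checks only make sense for pages that end up pre-paginated.
pub fn needs_viewport_check(package_layout: Layout, itemref_properties: &str) -> bool {
    effective_layout(package_layout, itemref_properties) == Layout::PrePaginated
}
```

In a fixed-layout book, a page marked rendition:layout-reflowable resolves to Reflowable and is skipped by the viewport rules.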

Corpus harness improvements

  • Pinned corpus SHA (45feac9) for reproducible baselines; mismatch prints a warning
  • SHA included in the JSON report

Lightningcss parsed cache

ExtractedEpub::css_summary(href) returns a CssSummary struct (imports, url refs, font faces, namespaces, media features, property names, forbidden positions) with caching. Future check modules can use this instead of re-scanning raw CSS text.
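The caching behaviour can be illustrated with a per-href map that is filled on first request. This sketch is not the real implementation: it swaps the lightningcss parse for a naive line scan, keeps only two of the CssSummary fields, takes &mut self (the real method may use interior mutability), and adds a parse_count field purely to make the caching observable.

```rust
use std::collections::HashMap;

/// Reduced summary with only two of the fields listed above.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct CssSummary {
    pub imports: Vec<String>,
    pub forbidden_positions: Vec<String>,
}

/// Naive stand-in for a real CSS parse: one declaration or at-rule per line.
fn summarize(css: &str) -> CssSummary {
    let mut summary = CssSummary::default();
    for line in css.lines().map(str::trim) {
        if let Some(rest) = line.strip_prefix("@import") {
            summary.imports.push(rest.trim().trim_end_matches(';').to_string());
        }
        if let Some((prop, value)) = line.split_once(':') {
            let value = value.trim().trim_end_matches(';').trim();
            if prop.trim() == "position" && matches!(value, "fixed" | "absolute" | "sticky") {
                summary.forbidden_positions.push(value.to_string());
            }
        }
    }
    summary
}

/// Stand-in for the extracted package: stylesheet text keyed by href.
pub struct ExtractedEpub {
    css_sources: HashMap<String, String>,
    css_cache: HashMap<String, CssSummary>,
    pub parse_count: usize, // hypothetical counter, for demonstration only
}

impl ExtractedEpub {
    pub fn new(css_sources: HashMap<String, String>) -> Self {
        ExtractedEpub { css_sources, css_cache: HashMap::new(), parse_count: 0 }
    }

    /// Summarize a stylesheet on first request and reuse the result afterwards.
    pub fn css_summary(&mut self, href: &str) -> Option<&CssSummary> {
        if !self.css_cache.contains_key(href) {
            let source = self.css_sources.get(href)?;
            let summary = summarize(source);
            self.parse_count += 1;
            self.css_cache.insert(href.to_string(), summary);
        }
        self.css_cache.get(href)
    }
}
```

Several checks asking for the same stylesheet then share one parse, which is what lets later check modules avoid re-scanning raw CSS text.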

Test coverage

  • 3 new tests replace #[allow(dead_code)] on mobi.rs test helpers (strip_trailers, palmdoc_decompress, count_tag_balance)
  • 736 tests total (691 lib + 32 cli + 7 parity + 6 roundtrip), zero cargo warnings

Corpus baseline (v0.13.1, 201 tests)

All non-noise findings confirmed legitimate by manual investigation:

  • R11.2/R11.7 x6: SVG/PNG spine items without viewport (correct)
  • R17.1 x22: <script> and <iframe> tags (correct for Kindle)
  • R6.3 x19: <script> elements, full subset of R17.1 (correct)
  • R7.1 x6: files on disk but not in manifest (5 correct, 1 ambiguous EPUB 3.3 linked record)

Full Changelog: v0.13.0...v0.13.1

v0.13.0 — Phase 2: epubcheck rule port

16 Apr 00:27

Phase 2: epubcheck rule port

117 active validation rules (up from 22), porting the STEAL-grade subset of w3c/epubcheck that actually matters for Kindle/KDP ingestion. Pure EPUB 3.3 conformance is out of scope.

New rule clusters

  • Section 5 — TOC/NCX/NAV (R5.4–R5.11): NCX empty navPoint, mismatched dtb:uid, whitespace-only uids, missing page-list when page breaks exist, guide-reference sanity (accepts NCX as a legacy Kindle guide target), spine toc must name an NCX.
  • Section 6 — HTML/CSS parse and forbidden (R6.6–R6.17, R6.e1–R6.e2): XHTML well-formedness, DOCTYPE, encoding (Cluster B), lightningcss parse errors, forbidden position values, unresolved @import / url() / @font-face, @namespace and unsupported @media features (Cluster I).
  • Section 7 — Manifest/spine integrity (R7.1–R7.13): undeclared files, declared-vs-actual media-type magic, duplicate itemrefs, text/html-where-xhtml-expected, package file listed in its own manifest, etc.
  • Section 8 — OPF prefix and property grammar (R8.1–R8.10): prefix attribute syntax, manifest properties grammar, EPUB 3 gated.
  • Section 9 — Cross-references and dead links (R9.1–R9.12): image fragments, undefined id targets, CSS/image fragment anchors, SVG <use> without a fragment, whitespace URLs, path-traversal, data: / file: URLs, query-string in relative URLs, manifest href with fragment.
  • Section 10 — Image integrity (R10.4.3–R10.4.5): corrupt headers, extension/magic-bytes mismatch.
  • Section 11 — Fixed-layout EPUB (R11.1–R11.9, profile-gated Comic|Textbook): missing rendition:layout, missing viewport meta, invalid rendition:spread, viewport dimensions, layout conflicts.
  • Section 13 — OCF filename (R13.1–R13.5): forbidden characters, case-folding collisions, path length, percent-encoding.
  • Section 15 — DICT profile (R15.1–R15.7 Amazon legacy, R15.e1–R15.e7 EPUB 3 gated): dictionary-only rules for KDP dictionaries.
  • Section 16 — OPF metadata and package identity (R16.1–R16.8): unique-identifier integrity, W3CDTF dates, empty DC elements, UUID scheme, BCP47 language tags.

Infrastructure

  • Phase 0 validator refactor: Check trait, ExtractedEpub struct, Profile enum (Default/Comic/Dict/Textbook), Severity ladder, lightningcss raw-text cache.
  • 9 parallel cluster agents merged via pre-committed insertion markers so cherry-picks auto-merge (with some hand-resolved tail conflicts).
  • Real bugs surfaced and fixed during integration: resolve_relative escape-marker handling, kindle: scheme URL classification, XML-content detect_kind classification, comic builder OPF/NCX uid mismatch, R5.10 legacy Kindle pattern, test fixture viewport metas.

Test and tooling

  • 733 tests (from 473). 688 lib + 32 cli + 7 parity + 6 roundtrip.
  • Zero cargo warnings, zero ignored cross-ref integration tests.
  • Every new rule sanity-checked against tests/fixtures/clean_book, clean_dict, and ~/Documents/lemma/lemma_greek_en/lemma_greek_en.opf — no new rule fires on any known-good input.

Full Changelog: v0.12.0...v0.13.0

v0.12.0

13 Apr 00:40

Full Changelog: v0.7.4...v0.12.0

v0.7.4

10 Apr 23:58

Full Changelog: v0.7.3...v0.7.4

v0.7.1

10 Apr 17:48

Full Changelog: v0.7.0...v0.7.1

v0.7.0

10 Apr 17:47

Full Changelog: v0.6.0...v0.7.0

v0.6.0

10 Apr 00:02

Full Changelog: v0.5.1...v0.6.0

v0.5.1

09 Apr 22:22

Full Changelog: v0.2.2...v0.5.1