CDN urls updated pdf-parse/utils replaced with pdf-parse/node #31

Merged
mehmet-kozan merged 11 commits into main from development
Oct 19, 2025

@mehmet-kozan
Owner

This pull request introduces several improvements and housekeeping updates across the codebase, focusing on documentation, build configuration, and API extraction. The most significant changes include the addition of new geometric and table-related classes to the public API, updates to documentation and CDN paths to reflect new directory structures, and refactoring of build and configuration files for better clarity and maintainability.

API and Documentation Enhancements:

  • Added new geometry and table-related classes (Line, LineStore, Point, Rectangle, Shape, Table, TableData, TableCell, TableRow) to the public API documentation, providing more advanced PDF structural analysis capabilities.
  • Introduced a new API report file for the worker module, documenting the CanvasFactory, getPath, and getData exports.

Documentation and Usage Updates:

  • Updated README.md to reflect new import paths for utilities and worker helpers (e.g., pdf-parse/node, pdf-parse/worker), revised CDN URLs to match new directory structure (dist/pdf-parse/web/), and clarified worker and canvas factory usage.
  • Updated worker documentation to use new CDN paths for browser worker files.
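
For callers migrating across this breaking change, the import-path update can be sketched as a small codemod. This is a minimal sketch and not part of the PR: `rewriteImports` and `RENAMED_SPECIFIERS` are hypothetical names, and the only mapping taken from this PR is pdf-parse/utils to pdf-parse/node.

```typescript
// Hypothetical migration helper; it is not shipped with pdf-parse.
// The single mapping below comes from this PR's title.
const RENAMED_SPECIFIERS = new Map<string, string>([
  ["pdf-parse/utils", "pdf-parse/node"],
]);

// Rewrites any quoted string literal that exactly equals a renamed specifier.
// Exact matching leaves a bare "pdf-parse" import and unrelated subpaths alone.
// Note that it also rewrites matching strings outside import statements.
function rewriteImports(source: string): string {
  return source.replace(
    /(["'])([^"'\n]+)\1/g,
    (match: string, quote: string, specifier: string) => {
      const renamed = RENAMED_SPECIFIERS.get(specifier);
      return renamed === undefined ? match : `${quote}${renamed}${quote}`;
    },
  );
}
```

Running this over a source file's text would change `from "pdf-parse/utils"` to `from "pdf-parse/node"` while leaving `from "pdf-parse"` as written. Review the resulting diff before committing, because the helper does not parse the code.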

Build and Configuration Refactoring:

  • Refactored API Extractor configuration files to match new output directory conventions (dist/pdf-parse/esm, dist/pdf-parse/cjs, dist/node/esm, dist/node/cjs), enabled doc model and TSDoc metadata generation for node utilities, and renamed/moved configuration files for clarity.
  • Removed obsolete or redundant configuration files, such as Vite and TypeScript configs for deprecated build targets.

Quality and CI Improvements:

  • Updated Node.js version in GitHub Actions workflow to use version 22 for CI consistency.
  • Extended SonarCloud copy-paste detection exclusions to include the bin directory.

Editor and Project Settings:

  • Added new file associations in VSCode settings for additional API extractor and TSDoc metadata files.
  • Updated Biome configuration to reflect directory renaming (dist-browser to dist-web).

Moved shared CSS for demo HTML files into a new styles.css file and updated HTML files to reference it, reducing duplication. Removed all files from the examples directory. Updated package.json keywords for improved discoverability and set dependency versions to exact values.
Moved utility config files to project root and updated related paths in scripts and configs. Enhanced PDFParse to support custom page joiners and improved line break detection based on line height. Cleaned up test descriptions and documentation, and simplified Vitest config plugin usage and path aliases.
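
The commit above does not show how page joiners or height-based line breaks are implemented. The following is only a sketch of the general idea under stated assumptions: `TextItem`, `joinItems`, `joinPages`, and the `threshold` parameter are hypothetical and are not PDFParse's API.

```typescript
// Hypothetical shape; pdf-parse's internal text items may differ.
interface TextItem {
  text: string;
  y: number; // baseline position, in page units
  height: number; // line height of the item
}

// Joins items, starting a new line when the baseline moves by more than
// `threshold` times the previous item's line height; otherwise adds a space.
function joinItems(items: TextItem[], threshold = 0.5): string {
  let out = "";
  let prev: TextItem | undefined;
  for (const item of items) {
    if (prev !== undefined) {
      const moved = Math.abs(item.y - prev.y);
      out += moved > prev.height * threshold ? "\n" : " ";
    }
    out += item.text;
    prev = item;
  }
  return out;
}

// Appends a caller-supplied separator after each page's text.
function joinPages(
  pages: string[],
  joiner: (pageNumber: number, total: number) => string,
): string {
  return pages.map((text, i) => text + joiner(i + 1, pages.length)).join("");
}
```

Scaling the threshold by line height, instead of using a fixed distance, lets the same rule work for both small and large fonts, which is one plausible motivation for the change described.
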
Moved the 'pdf-parse' alias below 'pdf-parse/utils' in vitest.config.ts and removed the vite-tsconfig-paths plugin. Also made a minor formatting change in tsconfig.json for the 'pdf-parse' path. This streamlines path resolution and removes unnecessary plugin usage.
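
One plausible reason the alias order matters: if aliases are checked in declaration order and a string key also matches subpaths, a general 'pdf-parse' entry listed first would capture 'pdf-parse/utils' before the more specific entry is reached. The sketch below is a simplified model of that behaviour, not Vite's implementation; `resolveAlias` and the replacement paths are hypothetical.

```typescript
// Simplified model of ordered, prefix-based alias resolution.
// It only illustrates first-match-wins; real resolvers do more.
interface Alias {
  find: string;
  replacement: string;
}

function resolveAlias(id: string, aliases: Alias[]): string {
  for (const { find, replacement } of aliases) {
    // A string alias matches the exact id or any subpath under it.
    if (id === find || id.startsWith(find + "/")) {
      return replacement + id.slice(find.length);
    }
  }
  return id; // no alias applies
}

// Hypothetical targets, loosely mirroring the src/ layout named in this PR.
const general: Alias = { find: "pdf-parse", replacement: "/src/pdf-parse/index.ts" };
const specific: Alias = { find: "pdf-parse/utils", replacement: "/src/node/index.ts" };

const generalFirst: Alias[] = [general, specific];
const specificFirst: Alias[] = [specific, general];

// General entry first: the subpath is captured by the wrong alias.
console.log(resolveAlias("pdf-parse/utils", generalFirst)); // "/src/pdf-parse/index.ts/utils"
// Specific entry first: the subpath resolves to its own target.
console.log(resolveAlias("pdf-parse/utils", specificFirst)); // "/src/node/index.ts"
```
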
Introduces vitest.config.package.ts for package-specific test configuration. Updates package.json to add a beta version, new test script 'test:p', and reorganizes devDependencies for improved test management.
Moved core source files to src/pdf-parse and node-specific files to src/node. Updated build outputs, TypeScript configs, and package.json exports to reflect new directory structure. Renamed utility configs and extractor configs for clarity. Updated test imports and removed legacy TableUtil and related test. Adjusted Vite and Vitest configs for new paths. Expanded API documentation for geometry and table types.
Introduces a new build script for worker code using esbuild, adds a worker entry point and canvas utility classes for use in worker environments, and updates the Vite config to copy the worker bundle. Also adds a new npm script for building the worker and makes minor improvements to the integration test script.
Introduces api-extractor.worker.json and tsconfig.worker.json for generating type declarations and API reports for the worker build. Updates package.json exports and scripts to use the new build and type output structure. Enhances build-worker.mjs to handle type file copying and cleanup, and adds ESM build output. Test imports updated to use CanvasFactory instead of CustomCanvasFactory.
Updated all references from pdf.worker.js to pdf.worker.mjs for consistency with ES module usage. Added a TypeScript ambient module declaration for pdf.worker.mjs to prevent import errors. Improved worker loading logic with error handling and fallback in index.ts.
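
The fallback logic in index.ts is not shown in this PR. As a generic illustration of loading with error handling and fallback, the sketch below tries a list of candidate loaders in order; `loadWithFallback` is a hypothetical helper and does not reflect the actual code.

```typescript
// Hypothetical helper; the PR does not show the logic in index.ts.
// Tries each loader in order and returns the first result that loads.
function loadWithFallback<T>(loaders: Array<() => T>): T {
  const failures: unknown[] = [];
  for (const load of loaders) {
    try {
      return load();
    } catch (error) {
      failures.push(error); // remember the failure and try the next candidate
    }
  }
  // Every candidate failed (or none was given): surface one clear error.
  throw new Error(
    `no worker source could be loaded (${failures.length} attempts failed)`,
  );
}
```

A caller might pass a loader for the preferred worker location first and a bundled default second, so that a missing file degrades to the default instead of failing on the first error.
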
Renamed getDataUrl to getData in worker API and updated all references in documentation and examples. Improved worker path resolution for both CJS and ESM environments. Added troubleshooting examples for worker usage. Updated CDN URLs and usage instructions in README for clarity. Enhanced build script to handle import.meta.url replacement and improved type cleanup.
Deleted all files related to the custom worker and canvas build pipeline, including bin/canvas, bin/worker, scripts/rename-cjs.mjs, and vite.config.worker.ts. Updated package.json to remove unused build:worker.back and cleaned up scripts. Added 'pdf-parse/worker' alias in vitest.config.ts and updated VSCode settings for new metadata files. This refactor removes legacy worker/canvas build logic in favor of a new approach.
Renamed all 'browser' build outputs and references to 'web' for consistency across the codebase, including source files, build scripts, documentation, and example/demo imports. Updated related paths in .gitignore, biome.json, package.json, Vite config, and test/benchmark imports. Added new worker tests and adjusted test/benchmark structure for improved coverage and organization.
mehmet-kozan self-assigned this on Oct 19, 2025
mehmet-kozan added the breaking-change (a change that breaks existing functionality or APIs) and configuration (relates to build, compiler, test, or CI settings) labels on Oct 19, 2025
mehmet-kozan merged commit 8a7a044 into main on Oct 19, 2025
11 checks passed
mehmet-kozan deleted the development branch on October 20, 2025 00:42