HanDoc

TypeScript library for reading and writing Korean HWP/HWPX documents.

Features

📖 Read HWPX files — tested on 349 real-world documents
✍️ Write HWPX files — programmatic document creation with builder API
📄 Read HWP 5.x binary files (OLE2/CFB format)
🔄 Round-trip preservation — parse → write → identical output
📊 Rich content extraction — tables, images, equations, shapes, headers/footers, footnotes
📐 Page layout — size, margins, columns, section properties
🔤 Full text extraction with per-section support
🔁 Format conversion — DOCX ↔ HWPX, HTML export, PDF export
👁️ Web viewer — React component for rendering HWPX documents
✏️ Web editor — ProseMirror-based WYSIWYG editor (in progress)

Quick Start

npm install @handoc/hwpx-parser @handoc/hwpx-writer

Read an HWPX file

import { HanDoc } from '@handoc/hwpx-parser';
import fs from 'fs';

const doc = await HanDoc.open(new Uint8Array(fs.readFileSync('document.hwpx')));

// Extract all text
console.log(doc.extractText());

// Access structured data
for (const section of doc.sections) {
  console.log(section);
}

// Images, tables, metadata
console.log(doc.metadata);   // { title, creator, language }
console.log(doc.images);     // [{ path, data, contentType }, ...]
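
The API overview lists `doc.extractTextBySection()` as returning one string per section. A minimal sketch of how that result might be summarized follows. `SectionTextSource` and `summarizeSections` are hypothetical names introduced here, and the interface assumes only that one documented method, so the snippet runs without the library installed.

```typescript
// Hypothetical structural type: any object exposing the documented
// extractTextBySection() method (returning string[]) satisfies it.
interface SectionTextSource {
  extractTextBySection(): string[];
}

interface SectionSummary {
  index: number;      // zero-based section index
  characters: number; // Unicode code points in the section text
  preview: string;    // leading characters with whitespace collapsed
}

// Build one summary entry per section.
function summarizeSections(doc: SectionTextSource, previewLength = 20): SectionSummary[] {
  return doc.extractTextBySection().map((text, index) => {
    const collapsed = text.replace(/\s+/g, ' ').trim();
    return {
      index,
      characters: Array.from(text).length,
      preview: Array.from(collapsed).slice(0, previewLength).join(''),
    };
  });
}

// Stand-in object for illustration; a parsed document would be passed instead.
const fakeDoc: SectionTextSource = {
  extractTextBySection: () => ['안녕하세요\n첫 번째 구역', 'Second section'],
};
console.log(summarizeSections(fakeDoc));
```

Counting with `Array.from` rather than `string.length` keeps the numbers meaningful for text outside the Basic Multilingual Plane.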

Read an HWP 5.x file

import { extractTextFromHwp } from '@handoc/hwp-reader';
import fs from 'fs';

// Load the HWP 5.x binary (OLE2/CFB container) into memory
const buf = new Uint8Array(fs.readFileSync('document.hwp'));
// Extract the document's plain text
const text = extractTextFromHwp(buf);
console.log(text);
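
Because HWP 5.x files are OLE2/CFB containers and HWPX files are ZIP packages, the two can usually be told apart from their leading bytes before choosing a reader. The sketch below checks the standard CFB and ZIP signatures. `detectHangulFormat` is a hypothetical helper, not part of any HanDoc package, and it identifies only the container: other CFB or ZIP based formats (for example `.doc` or `.docx`) would match as well.

```typescript
type HangulFormat = 'hwp5' | 'hwpx' | 'unknown';

// OLE2/CFB compound files begin with this 8-byte signature.
const CFB_MAGIC = [0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1];
// ZIP archives normally begin with a local file header: "PK\x03\x04".
const ZIP_MAGIC = [0x50, 0x4b, 0x03, 0x04];

// True when `bytes` begins with every byte of `magic`, in order.
function startsWith(bytes: Uint8Array, magic: number[]): boolean {
  if (bytes.length < magic.length) return false;
  return magic.every((value, i) => bytes[i] === value);
}

// Hypothetical helper: guess the container format from the leading bytes only.
function detectHangulFormat(bytes: Uint8Array): HangulFormat {
  if (startsWith(bytes, CFB_MAGIC)) return 'hwp5';
  if (startsWith(bytes, ZIP_MAGIC)) return 'hwpx';
  return 'unknown';
}

console.log(detectHangulFormat(new Uint8Array([0x50, 0x4b, 0x03, 0x04]))); // "hwpx"
```

A caller could use the result to decide between `extractTextFromHwp` and `HanDoc.open`, and fall back to an error message for `'unknown'`.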

Create an HWPX from scratch

import { HwpxBuilder } from '@handoc/hwpx-writer';
import fs from 'fs';

const bytes = HwpxBuilder.create()
  .addParagraph('Hello World', { bold: true, fontSize: 20 })
  .addTable([['Name', 'Score'], ['Alice', '95']])
  .addParagraph('Second paragraph')
  .build();

fs.writeFileSync('output.hwpx', bytes);
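
The `addTable` call above takes a 2D string array with the header row first. When the data starts out as an array of objects, a small adapter can produce that shape. This is a sketch only: `toTableRows` and its `columns` parameter are hypothetical and not part of `@handoc/hwpx-writer`.

```typescript
interface ColumnSpec {
  key: string;   // property to read from each record
  label: string; // text for the header cell
}

// Hypothetical helper: convert records into string[][] with a header row.
// null and undefined become empty cells; everything else goes through String().
function toTableRows(records: Record<string, unknown>[], columns: ColumnSpec[]): string[][] {
  const headerRow = columns.map((column) => column.label);
  const bodyRows = records.map((record) =>
    columns.map((column) => {
      const value = record[column.key];
      return value === null || value === undefined ? '' : String(value);
    }),
  );
  return [headerRow, ...bodyRows];
}

const tableRows = toTableRows(
  [
    { name: 'Alice', score: 95 },
    { name: 'Bob', score: null },
  ],
  [
    { key: 'name', label: 'Name' },
    { key: 'score', label: 'Score' },
  ],
);
console.log(tableRows);
// [['Name', 'Score'], ['Alice', '95'], ['Bob', '']]
```

The result could then be passed to `.addTable(tableRows)` in the builder chain.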

Convert HWPX to DOCX

import { HanDoc } from '@handoc/hwpx-parser';
import { writeDOCX } from '@handoc/docx-writer';
import fs from 'fs';

const doc = await HanDoc.open(new Uint8Array(fs.readFileSync('input.hwpx')));
const docxBuffer = await writeDOCX(doc);
fs.writeFileSync('output.docx', docxBuffer);

Packages

This is a monorepo with 12 packages:

Core Packages

| Package | Description | Version |
| --- | --- | --- |
| `@handoc/document-model` | Shared TypeScript types and utilities for the HWP/HWPX document model | 0.1.0 |
| `@handoc/hwpx-core` | Low-level HWPX (OPC/ZIP) package reader | 0.1.0 |
| `@handoc/hwpx-parser` | Parse HWPX files into a structured document model — the main read API | 0.1.0 |
| `@handoc/hwpx-writer` | Generate HWPX files from a document model, with `HwpxBuilder` for easy creation | 0.1.0 |
| `@handoc/hwp-reader` | Read HWP 5.x binary format (OLE2/CFB) and extract text | 0.1.0 |

Format Conversion

| Package | Description | Version |
| --- | --- | --- |
| `@handoc/docx-reader` | Parse DOCX files and convert to HWPX via HanDoc document model | 0.1.0 |
| `@handoc/docx-writer` | Convert HWPX documents to DOCX format | 0.1.0 |
| `@handoc/html-reader` | Parse HTML and convert to HWPX document model | 0.1.0 |
| `@handoc/pdf-export` | HWPX to PDF export via HTML rendering and Playwright | 0.1.0 |

UI Components

| Package | Description | Version |
| --- | --- | --- |
| `@handoc/viewer` | React component for rendering HWPX documents in the browser | 0.1.0 |
| `@handoc/editor` | ProseMirror-based HWPX document editor | 0.1.0 |

CLI

| Package | Description | Version |
| --- | --- | --- |
| `@handoc/cli` | CLI tool for HanDoc - inspect, extract, and convert HWP/HWPX documents | 0.1.0 |

Roadmap

Level 1: HWPX Read/Write + HWP 5.x Read ✅ Complete

HWPX file parsing (ZIP → parts extraction)
HWPX header parsing (fonts, styles, paragraph styles)
HWPX body parsing (text, formatting, tables)
HWPX file writing (document model → ZIP)
HwpxBuilder API (programmatic document creation)
HWP 5.x binary reading (OLE2/CFB format)
Table parsing (cell merging, borders)
Image/OLE binary data extraction
Stats: 349/349 HWPX files parsed successfully, 221/221 HWP files

Level 2: Format Conversion ✅ Complete

HWPX → DOCX conversion
DOCX → HWPX conversion
HWPX → HTML conversion (standalone HTML)
CLI tool (convert, to-html commands)
Stats: 587 lines (docx-writer), 1,456 lines (docx-reader), 120 tests

Level 3: PDF Export ✅ Complete

HWPX → HTML → PDF (Puppeteer/Playwright)
Table, image, formatting HTML rendering
Base64 image embedding, CSS styling
CLI commands: handoc to-html, handoc to-pdf
Stats: 378 lines (pdf-export), real HWP file tests passed

Level 4: Web Viewer ✅ Complete

React component for HWPX rendering
Responsive layout, mobile support
Table rendering, image display
Section-based rendering
Stats: 235 lines, 13 tests

Level 5: Web Editor 🟡 In Progress

ProseMirror-based editor setup
HWPX → ProseMirror schema conversion
Basic editing (text, bold, italic, headings)
Table editing
Image insertion
Full WYSIWYG features
Stats: 271 lines, 21 tests

API Overview

`@handoc/hwpx-parser` — HanDoc

| API | Description |
| --- | --- |
| `HanDoc.open(buf)` | Parse an HWPX buffer, returns `Promise<HanDoc>` |
| `doc.extractText()` | Get all text as a single string |
| `doc.extractTextBySection()` | Get text per section as `string[]` |
| `doc.sections` | Parsed section tree (paragraphs, tables, shapes, etc.) |
| `doc.header` | Document header (fonts, styles, char/para properties) |
| `doc.metadata` | `{ title, creator, language }` |
| `doc.images` | Embedded images with binary data |
| `doc.sectionProps` | Page size, margins, columns |
| `doc.headers` / `doc.footers` | Page headers and footers |
| `doc.footnotes` | Footnotes |
| `doc.warnings` | Parse warnings (non-fatal issues) |
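
The HWP and HWPX formats express lengths such as page size and margins in HWPUNIT, which is 1/7200 of an inch. Whether `doc.sectionProps` exposes raw HWPUNIT values or already converted ones is not stated in this README, so the following is a sketch under the assumption that raw values come back. `hwpunitToMm` and `mmToHwpunit` are hypothetical helper names.

```typescript
const HWPUNIT_PER_INCH = 7200;
const MM_PER_INCH = 25.4;

// Convert a length in HWPUNIT (1/7200 inch) to millimetres.
function hwpunitToMm(value: number): number {
  return (value * MM_PER_INCH) / HWPUNIT_PER_INCH;
}

// Convert millimetres to the nearest whole HWPUNIT.
function mmToHwpunit(mm: number): number {
  return Math.round((mm * HWPUNIT_PER_INCH) / MM_PER_INCH);
}

// A4 paper is 210 mm wide.
console.log(mmToHwpunit(210));              // 59528
console.log(hwpunitToMm(59528).toFixed(1)); // "210.0"
```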

`@handoc/hwpx-writer`

| API | Description |
| --- | --- |
| `writeHwpx(doc, original?)` | Serialize a document model to HWPX bytes |
| `HwpxBuilder.create()` | Fluent builder for creating documents from scratch |
| `.addParagraph(text, style?)` | Add text with optional `{ bold, italic, fontSize, align }` |
| `.addTable(rows)` | Add a table from a 2D string array |
| `.addImage(data, ext, w?, h?)` | Add an image |
| `.addSectionBreak()` | Start a new section |
| `.build()` | Returns `Uint8Array` of the HWPX file |
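
The feature list describes round-trip preservation as parse → write → identical output. One way to spot-check that claim is to compare the original bytes with the written bytes. The sketch below assumes "identical" means byte-for-byte equality, which this README does not spell out. `firstDifference` is a hypothetical helper.

```typescript
// Returns the index of the first differing byte, or -1 when both buffers match.
// If one buffer is a strict prefix of the other, the shared length is returned.
function firstDifference(a: Uint8Array, b: Uint8Array): number {
  const shared = Math.min(a.length, b.length);
  for (let i = 0; i < shared; i++) {
    if (a[i] !== b[i]) return i;
  }
  return a.length === b.length ? -1 : shared;
}

const before = new Uint8Array([0x50, 0x4b, 0x03, 0x04]);
const after = new Uint8Array([0x50, 0x4b, 0x03, 0x05]);
console.log(firstDifference(before, before)); // -1
console.log(firstDifference(before, after));  // 3
```

In a round-trip test, `before` would be the bytes read from disk and `after` the bytes produced by the writer. Reporting the offset rather than a boolean makes a mismatch easier to locate.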

`@handoc/hwp-reader`

| API | Description |
| --- | --- |
| `readHwp(buf)` | Parse HWP 5.x binary, returns `HwpDocument` |
| `extractTextFromHwp(buf)` | Extract plain text from HWP binary |
| `openCfb(buf)` | Low-level OLE2/CFB reader |
| `parseRecords(stream)` | Parse HWP binary records |
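
For background on what parsing HWP binary records involves: in the published HWP 5.0 format, each record begins with a 32-bit little-endian header that packs a tag ID (bits 0-9), a nesting level (bits 10-19), and a payload size (bits 20-31). A size field of 0xFFF means the real size follows as a separate 32-bit value. The sketch below decodes one such header. It illustrates the format only and is not the implementation of `parseRecords`. `decodeRecordHeader` and `RecordHeader` are hypothetical names.

```typescript
interface RecordHeader {
  tagId: number;       // bits 0-9 of the header word
  level: number;       // bits 10-19 of the header word
  size: number;        // payload length in bytes
  headerBytes: number; // 4, or 8 when the extended size field is present
}

// Decode the record header that starts at `offset` in `stream`.
function decodeRecordHeader(stream: Uint8Array, offset = 0): RecordHeader {
  const view = new DataView(stream.buffer, stream.byteOffset, stream.byteLength);
  const word = view.getUint32(offset, true); // little-endian
  const tagId = word & 0x3ff;
  const level = (word >>> 10) & 0x3ff;
  let size = word >>> 20; // top 12 bits
  let headerBytes = 4;
  if (size === 0xfff) {
    // Extended form: the actual size is stored in the next 4 bytes.
    size = view.getUint32(offset + 4, true);
    headerBytes = 8;
  }
  return { tagId, level, size, headerBytes };
}

const exampleHeader = decodeRecordHeader(new Uint8Array([0x10, 0x00, 0xa0, 0x01]));
console.log(exampleHeader);
// { tagId: 16, level: 0, size: 26, headerBytes: 4 }
```

A record walker would advance by `headerBytes + size` after each header to reach the next record.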

Stats

Packages: 12 (monorepo with Turborepo + pnpm)
Source Code: 7,809 lines (TypeScript)
Tests: 469 passed
Real Documents: 570/570 parsed (349 HWPX + 221 HWP)
Build: 12/12 packages ✅

Development

pnpm install
pnpm build      # Build all packages
pnpm test       # Run all tests
pnpm typecheck  # Type check

Contributing

Contributions are welcome! Please:

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add some amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

License

This product was developed with reference to the publicly released HWP document file (.hwp) format documentation from Hancom (한글과컴퓨터).
