HwpxConverter is a Windows converter that parses HWPX (OWPML) documents and converts them to HTML.
It uses Hancom’s public OWPML SDK libraries to traverse the document structure (sections/paragraphs/runs/tables, etc.) and produces HTML in a form that’s easy to inspect, debug, and feed into downstream pipelines.
Typical use cases:
- HWPX → HTML conversion (primary goal)
- Preprocessing for RAG / search pipelines (HTML/text extraction as an intermediate format)
- Batch conversion tests for public-sector documents (tables/outlines/lists)
- Text extraction based on Paragraph / TextRun traversal
- Outline (Outline 1–10) → HTML tag mapping (e.g., h1–h6)
- Table (<table>) layout + text rendering
- List rendering policy: no numbering computation; keep <ol> but display bullets only via CSS
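The outline-to-heading mapping above can be sketched as follows. This is a minimal illustration, not the converter's actual code: the function names are hypothetical, and clamping Outline 7–10 to h6 is an assumption, since the list above only gives h1–h6 as an example target range.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: maps an outline level (1-10) to an HTML heading tag name.
// Assumption: levels above 6 collapse to h6; values below 1 are treated as 1.
std::string OutlineLevelToHeadingTag(int outlineLevel) {
    const int clamped = std::max(1, std::min(outlineLevel, 6));
    return "h" + std::to_string(clamped);
}

// Hypothetical helper: escapes characters that are unsafe inside HTML text.
std::string EscapeHtml(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (char c : text) {
        switch (c) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;"; break;
            case '>': out += "&gt;"; break;
            case '"': out += "&quot;"; break;
            default:  out += c; break;
        }
    }
    return out;
}

// Wraps the escaped paragraph text in the heading tag chosen for its outline level.
std::string RenderOutlineHeading(int outlineLevel, const std::string& text) {
    const std::string tag = OutlineLevelToHeadingTag(outlineLevel);
    return "<" + tag + ">" + EscapeHtml(text) + "</" + tag + ">";
}
```

Under these assumptions, RenderOutlineHeading(2, "A & B") yields <h2>A &amp; B</h2>, and any level from 6 to 10 produces an h6 element.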
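Source: official publication of the National Police Agency
(Announcement of the Selection Plan for New Projects under the 2026 Future Policing Challenge Technology Development Program; original title: 2026년도 미래치안도전기술개발사업 신규과제 선정계획 공고)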
- Input: .hwpx only
- Even with a .hwpx extension, some files may fail to open if they are non-standard HWPX or corrupted
  - Examples: legacy HWP renamed to .hwpx, institution-provided files with non-standard packaging, damaged archives
- Currently targets Windows + Visual Studio 2022 (as the baseline environment)
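One way to tell a legacy HWP file renamed to .hwpx apart from a real HWPX package is to look at the first bytes of the file. The sketch below assumes that an HWPX document is a ZIP-based package (local file header signature PK\x03\x04) and that legacy HWP 5.x files are OLE compound files (signature D0 CF 11 E0 A1 B1 1A E1). The function and enum names are hypothetical and are not part of the converter or the OWPML SDK.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

enum class ContainerKind { Zip, OleCompound, Unknown };

// Hypothetical helper: classifies a file by its leading bytes.
// A ZIP signature is necessary for HWPX but not sufficient: the archive
// can still be damaged or packaged in a non-standard way.
ContainerKind ClassifyBySignature(const std::string& head) {
    static const std::string zipMagic("PK\x03\x04", 4);
    static const std::string oleMagic("\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1", 8);
    if (head.compare(0, zipMagic.size(), zipMagic) == 0) return ContainerKind::Zip;
    if (head.compare(0, oleMagic.size(), oleMagic) == 0) return ContainerKind::OleCompound;
    return ContainerKind::Unknown;
}

// Reads up to `count` bytes from the start of the file.
// Returns an empty string if the file cannot be opened.
std::string ReadHead(const std::string& path, std::size_t count) {
    std::ifstream in(path, std::ios::binary);
    std::string head(count, '\0');
    in.read(&head[0], static_cast<std::streamsize>(count));
    head.resize(static_cast<std::size_t>(in.gcount()));
    return head;
}
```

Calling ClassifyBySignature(ReadHead(path, 8)) before handing the file to the SDK would allow a more specific message, such as "this looks like a legacy HWP file", instead of a generic open failure.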
If you want to run it without building from source, download the latest executable (HwpxConverter.exe) from GitHub Releases and run:
HwpxConverter.exe "input.hwpx" "output.html"
- If a path or file name contains spaces, wrap it in double quotes; otherwise the shell splits it into separate arguments.
This project requires Hancom’s OWPML SDK libraries.
- Windows 11
- Visual Studio 2022
- Platform: Win32 (x86) (based on current project settings)
Prepare the SDK from Hancom’s public repo (https://github.com/hancom-io/hwpx-owpml-model):
Build that repo to obtain the required libraries (e.g. Owpml.lib, OWPMLApi.lib, OWPMLUtil.lib) and headers, then place them in this repo’s expected layout (include/, lib/) so that they match the include and library paths configured in the .vcxproj.
- Open the solution (HwpxConverter.sln) in Visual Studio
- Configuration: Release
- Platform: Win32
- Build
After building, Release/HwpxConverter.exe will be produced.
HwpxConverter.exe "InputFile.hwpx" "OutputFile.html"
- If the input file is not .hwpx, the program prints an error and exits immediately.
- If it is .hwpx but conversion fails, it prints guidance indicating the file may be non-standard or corrupted.
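The argument and extension checks described above could look roughly like this. It is a sketch under stated assumptions, not the program's actual source: the function names, the case-insensitive comparison, the message wording, and the specific exit codes (1 for usage errors, 2 for a wrong extension) are all hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical helper: true when the path ends with ".hwpx".
// Assumption: the comparison ignores case, so "A.HWPX" is accepted.
bool HasHwpxExtension(const std::string& path) {
    static const std::string ext = ".hwpx";
    if (path.size() < ext.size()) return false;
    const std::size_t start = path.size() - ext.size();
    for (std::size_t i = 0; i < ext.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(path[start + i]);
        if (std::tolower(c) != ext[i]) return false;
    }
    return true;
}

// Hypothetical helper: validates the two positional arguments (input, output).
// Returns 0 when conversion may proceed, otherwise a non-zero exit code.
int CheckArguments(const std::vector<std::string>& args) {
    if (args.size() != 2) {
        std::cerr << "Usage: HwpxConverter.exe \"InputFile.hwpx\" \"OutputFile.html\"\n";
        return 1;
    }
    if (!HasHwpxExtension(args[0])) {
        std::cerr << "Error: input file must have a .hwpx extension: " << args[0] << "\n";
        return 2;
    }
    return 0;
}
```

In a main function, the arguments would typically be collected with std::vector<std::string>(argv + 1, argv + argc) and the program would return early whenever CheckArguments reports a non-zero code.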
A practical local test layout:
- test/cases/: input .hwpx files
- test/expected/: expected output (HTML) or baseline outputs
- test/out/: actual outputs generated by running the converter (recommended to gitignore)
Example:
HwpxConverter.exe "test/cases/table_only.hwpx" "test/out/table_only.html"
Minimum recommended set (3 docs):
- Outline-only document
- Table-only document
- List-only document
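For batch runs over the test layout above, the command lines can be derived from the case file names. The following is a minimal sketch with hypothetical helper names; it only builds the command strings and leaves running them to the caller. Apart from table_only.hwpx, any case file names passed in are placeholders.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: "test/cases/table_only.hwpx" + "test/out"
// becomes "test/out/table_only.html".
std::string OutputPathForCase(const std::string& casePath, const std::string& outDir) {
    const std::size_t slash = casePath.find_last_of("/\\");
    std::string name = (slash == std::string::npos) ? casePath : casePath.substr(slash + 1);
    const std::size_t dot = name.find_last_of('.');
    if (dot != std::string::npos) name.erase(dot);  // drop the extension
    return outDir + "/" + name + ".html";
}

// Builds one converter invocation, quoting both paths so that spaces are safe.
std::string BuildCommand(const std::string& exe, const std::string& input,
                         const std::string& output) {
    return exe + " \"" + input + "\" \"" + output + "\"";
}

// Builds one command per test case, writing every result into outDir.
std::vector<std::string> BuildBatchCommands(const std::vector<std::string>& casePaths,
                                            const std::string& outDir) {
    std::vector<std::string> commands;
    commands.reserve(casePaths.size());
    for (const std::string& casePath : casePaths) {
        commands.push_back(
            BuildCommand("HwpxConverter.exe", casePath, OutputPathForCase(casePath, outDir)));
    }
    return commands;
}
```

Each generated string can then be run from a script or passed to std::system, and the files in test/out/ compared against test/expected/.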
- Indentation: spaces
- Encoding: UTF-8 recommended
- Commit example: converter: add hwpx extension validation
See LICENSE for details.
- Hancom OWPML SDK / reference implementation: https://github.com/hancom-io/hwpx-owpml-model



