Convert Word documents to clean Markdown from the command line.
One dependency · Zero config · Works everywhere
docx2md wraps pandoc with sensible defaults so you can convert .docx files to Markdown without memorizing flags.
docx2md report.docx
# ✓ report.docx → report.md

That's it. No config files, no build step, no Node.js runtime. One script, one dependency.
curl -fsSL https://raw.githubusercontent.com/jschof1/docx2md/main/install.sh | bash

git clone https://github.com/jschof1/docx2md.git
cd docx2md
sudo make install

curl -fsSL https://raw.githubusercontent.com/jschof1/docx2md/main/docx2md -o /usr/local/bin/docx2md
chmod +x /usr/local/bin/docx2md

pandoc must be installed:
| Platform | Install |
|---|---|
| macOS | brew install pandoc |
| Ubuntu/Debian | sudo apt install pandoc |
| Fedora | sudo dnf install pandoc |
| Windows (WSL) | sudo apt install pandoc |
| Windows (native) | choco install pandoc |
| Arch | sudo pacman -S pandoc |
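Before converting anything, it can help to confirm that pandoc is actually on your PATH. The snippet below is a minimal sketch; `have_cmd` is a hypothetical helper name, not part of docx2md.

```shell
# Hypothetical helper (not part of docx2md): succeed if a command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd pandoc; then
  # Print only the first line of pandoc's version banner.
  pandoc --version | head -n 1
else
  echo "pandoc not found; install it using the table above" >&2
fi
```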
docx2md report.docx             # → report.md
docx2md report.docx notes.md    # → notes.md (custom output name)

docx2md --images report.docx    # images extracted to ./images/
docx2md -i assets report.docx   # images extracted to ./assets/

docx2md chapter1.docx chapter2.docx chapter3.docx
docx2md *.docx

docx2md -s report.docx | head -20    # preview first 20 lines
docx2md -s report.docx | wc -w       # word count
docx2md -s report.docx > output.md   # redirect to file

USAGE
docx2md [OPTIONS] <input.docx...>
docx2md <input.docx> [output.md]
OPTIONS
-i, --images [DIR] Extract images into DIR (default: images/)
-s, --stdout Write Markdown to stdout instead of a file
-q, --quiet Suppress all output except errors
-w, --wrap MODE Line wrapping: none (default), auto, or preserve
-h, --help Show this help message
-V, --version Show version number
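When no output name is given, the examples above show report.docx becoming report.md. A rough sketch of that naming rule follows, assuming the script simply swaps the extension. `md_name` is a hypothetical helper, and where docx2md actually writes the file (next to the input or in the current directory) is not stated here.

```shell
# Hypothetical helper: derive the default output name by replacing a
# trailing .docx with .md (assumed behaviour, based on the usage examples).
md_name() {
  printf '%s\n' "${1%.docx}.md"
}

# docx2md accepts several inputs directly; this loop only illustrates
# the naming rule for each file.
for f in chapter1.docx chapter2.docx; do
  echo "$f -> $(md_name "$f")"
done
```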
There are plenty of docx-to-markdown tools. Here's why this one exists:
- Zero config: no config files, no presets, no decisions to make
- One dependency: only pandoc, which you probably already have
- Batch mode: convert 50 docs in one command
- Image extraction: pull embedded images out with one flag
- Pipes: stdout mode works with head, grep, wc, and everything else
- Portable: pure bash, runs on macOS, Linux, WSL, anywhere with a shell
- Fast: no runtime, no daemon, no overhead
| | docx2md | mattn/docx2md | microsoft/markitdown |
|---|---|---|---|
| Dependencies | pandoc | None (Go binary) | Python + packages |
| Install size | ~5 KB | ~3 MB | ~50 MB |
| Batch mode | β | β | β |
| Image extraction | β | β | β |
| Stdout / pipes | β | β | β |
| Config needed | None | None | None |
| Language | Bash | Go | Python |
git clone https://github.com/jschof1/docx2md.git
cd docx2md
# Run tests
make test
# Lint
make lint
# Install locally
make install
# Uninstall
make uninstall

Does it convert .doc (old Word format)?
Not directly. Convert to .docx first with libreoffice --convert-to docx file.doc, then use docx2md.
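The two-step route for legacy files could look roughly like this. The sketch only prints the commands rather than running them. `doc_to_md_steps` is a hypothetical helper, and `--headless` and `--outdir` are LibreOffice options added here as assumptions so that the .docx lands next to the original file.

```shell
# Hypothetical helper: print the two commands needed for one legacy .doc file.
# Printed for illustration only; paths containing spaces would need quoting
# before the commands are actually run.
doc_to_md_steps() {
  doc=$1
  # LibreOffice writes to the current directory unless --outdir is given.
  echo "libreoffice --headless --convert-to docx --outdir $(dirname "$doc") $doc"
  echo "docx2md ${doc%.doc}.docx"
}

doc_to_md_steps old/report.doc
```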
What about tables, footnotes, and math?
Pandoc handles all of these. Complex tables may need manual cleanup, but most convert cleanly.
Why wrap pandoc? Isn't this just pandoc -f docx -t markdown?
Yes, and that's the point. Nobody remembers those flags. docx2md report.docx is easier to type, easier to remember, and handles batch conversion and image extraction without reaching for the pandoc manual.
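To make the comparison concrete, here is a sketch of the kind of pandoc command a wrapper like this might assemble. It is not taken from the docx2md source: `pandoc_cmd` is a hypothetical helper, and pairing `--wrap` with the `-w` option and `--extract-media` with `-i` is an assumption. Both are real pandoc options.

```shell
# Hypothetical sketch: print the pandoc command a wrapper might run.
# Usage: pandoc_cmd INPUT.docx [WRAP_MODE] [IMAGE_DIR]
# Printed for illustration only; paths with spaces would need quoting.
pandoc_cmd() {
  input=$1
  wrap=${2:-none}    # assumed default, matching the -w option's default
  images=${3:-}
  cmd="pandoc -f docx -t markdown --wrap=$wrap"
  # Add media extraction only when an image directory was requested.
  if [ -n "$images" ]; then
    cmd="$cmd --extract-media=$images"
  fi
  printf '%s\n' "$cmd -o ${input%.docx}.md $input"
}

pandoc_cmd report.docx
pandoc_cmd report.docx auto images
```

Even in this simplified form, the second call comes out as six separate arguments to pandoc, which is the typing the wrapper is meant to save.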
Contributions welcome. Please:
- Fork the repo
- Create a feature branch (git checkout -b my-feature)
- Commit your changes
- Open a pull request
Keep it simple: this tool's value is its simplicity.
MIT: use it however you like.
If this saved you time, consider giving it a ⭐.