This repository contains the source materials for the Digital Pāḷi Dictionary (DPD) Pāḷi courses, transformed from original Google Docs into a modern, searchable static website.
- `docs/`: Markdown source files for all course materials.
  - `bpc/`: Beginner Pāḷi Course (BPC) lessons.
  - `bpc_ex/`: BPC exercises.
  - `bpc_key/`: BPC answer keys.
  - `ipc/`: Intermediate Pāḷi Course (IPC) lessons.
  - `ipc_ex/`: IPC exercises.
  - `ipc_key/`: IPC answer keys.
- `identity/`: DPD CSS and JavaScript assets used for the website and document generation.
- `scripts/`: Regularly used maintenance and generation scripts (runnable with `uv run`).
- `tools/`: Python modules used by scripts (imports only).
- `mkdocs.yaml`: Configuration for the MkDocs static site generator.
The website is built using MkDocs with the Material for MkDocs theme. It serves as the primary way to interact with the course materials.
In addition to the static website, this project can generate high-quality PDF and Word (.docx) documents for offline study and editing. These documents are generated directly from the same Markdown source files used for the website, ensuring consistency across all formats.
To generate documents locally, you must install the following system-level dependencies:
macOS (using Homebrew):

```sh
# For PDF generation (WeasyPrint dependencies)
brew install weasyprint
# or
brew install pango libffi

# For DOCX generation (Pandoc)
brew install pandoc
```

Linux (Ubuntu/Debian):

```sh
# For PDF generation
sudo apt-get install python3-pip python3-cffi python3-brotli libpango-1.0-0 libpangoft2-1.0-0

# For DOCX generation
sudo apt-get install pandoc
```

- Install Python dependencies. Ensure your local environment is up to date:

  ```sh
  uv sync
  ```

- Run the scripts:

  ```sh
  # Generate PDFs
  uv run python scripts/generate_pdfs.py

  # Generate DOCX
  uv run python scripts/generate_docx.py
  ```

The generated files will be placed in the `pdf_exports/` and `docx_exports/` directories, respectively.
All scripts are located in the `scripts/` directory and can be run using `uv run python scripts/<script_name>.py` (for Python scripts) or `uv run bash scripts/<script_name>.sh` (for shell scripts).
Build the website locally:

```sh
./scripts/cl/pali-build-website
```

Generate PDFs and DOCX documents:

```sh
./scripts/cl/pali-build-pdf-doc
```
- `verify_sources.py`: Interactive source-verification tool that compares the original (old) DOCX materials against the generated DOCX and PDF outputs, helping to identify discrepancies between source and generated formats.
  - Usage:
    ```sh
    uv run python scripts/verify_sources.py
    ```
- `verify_pdf_content.py`: Extracts text from generated PDFs and compares it with the source Markdown to ensure no data was lost during PDF generation.
  - Usage:
    ```sh
    uv run python scripts/verify_pdf_content.py
    ```
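The hard part of such a comparison is that PDF extraction loses Markdown markup and reflows whitespace. As a rough illustration (not the script's actual implementation, and leaving aside the PDF text extraction itself), a hypothetical `normalize` helper could reduce both sides to comparable plain text:

```python
import re

def normalize(text: str) -> str:
    """Strip Markdown punctuation and collapse whitespace so that text
    extracted from a PDF can be compared with its Markdown source."""
    text = re.sub(r"[#*_`>|-]", " ", text)  # drop Markdown markup characters
    text = re.sub(r"\s+", " ", text)        # collapse runs of whitespace
    return text.strip().lower()

md_source = "## Vocabulary\n\n*dhamma* -- teaching"
pdf_text = "Vocabulary\ndhamma   teaching"
print(normalize(md_source) == normalize(pdf_text))  # True
```

After normalization, a plain equality or diff check can flag any lesson whose PDF text diverges from its source.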
- `verify_docx_content.py`: Verification tool for DOCX content integrity. Compares text extracted from the generated Word documents with the source Markdown.
  - Usage:
    ```sh
    uv run python scripts/verify_docx_content.py
    ```
- `verify_numbering.py`: Verifies consistency of sentence numbering (footnotes, lists) across Markdown, website, and PDF, and identifies discrepancies where numbering resets or differs between formats.
  - Usage:
    ```sh
    uv run python scripts/verify_numbering.py
    ```
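The core of such a numbering check can be sketched in a few lines: collect the footnote reference numbers in document order and report where the sequence breaks. This is an illustrative sketch, not the script's actual code; the helper names are hypothetical.

```python
import re

def footnote_numbers(markdown: str) -> list[int]:
    """Collect footnote reference numbers ([^N]) in order of appearance."""
    return [int(m) for m in re.findall(r"\[\^(\d+)\]", markdown)]

def numbering_gaps(numbers: list[int]) -> list[tuple[int, int]]:
    """Return (expected, found) pairs wherever the sequence breaks."""
    return [(i + 1, n) for i, n in enumerate(numbers) if n != i + 1]

sample = "First[^1] point.\nSecond[^2] point.\nOops[^4], a gap."
gaps = numbering_gaps(footnote_numbers(sample))
print(gaps)  # [(3, 4)]: the third reference should be 3, not 4
```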
- `compare_md_sources.py`: Compares the current Markdown files against an older Git commit to detect potential data loss or regressions in course content.
  - Usage:
    ```sh
    uv run python scripts/compare_md_sources.py [--commit <hash>]
    ```
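A comparison like this presumably reads the old revision (e.g., via `git show <commit>:<path>`) and then diffs it against the working copy. The diffing step itself can be done with the standard library; this is a sketch under that assumption, with a hypothetical `removed_lines` helper:

```python
import difflib

def removed_lines(old: str, new: str) -> list[str]:
    """Lines present in the old revision but missing from the current one;
    a non-empty result flags potential data loss."""
    diff = difflib.unified_diff(old.splitlines(), new.splitlines(), lineterm="")
    return [line[1:] for line in diff
            if line.startswith("-") and not line.startswith("---")]

old = "# Lesson 1\nvocab line\nexample sentence\n"
new = "# Lesson 1\nvocab line\n"
print(removed_lines(old, new))  # ['example sentence']
```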
- `generate_pdfs.py`: Generates high-quality PDF course materials from the Markdown source files using WeasyPrint. Now also generates `pdf_exports/vocab.pdf` and `pdf_exports/abbreviations.pdf`.
  - Usage:
    ```sh
    uv run python scripts/generate_pdfs.py
    ```
- `generate_docx.py`: Generates Word (.docx) documents from the Markdown source using Pandoc. Maintains visual parity with the PDF output for offline study. Now also generates `docx_exports/vocab.docx` and `docx_exports/abbreviations.docx`.
  - Usage:
    ```sh
    uv run python scripts/generate_docx.py
    ```
- `renumber_footnotes.py`: Renumbers footnotes sequentially across all files in a course folder. The counter starts at 1 and continues across files in course order, automatically correcting duplicate numbers and out-of-order references.
  - Usage:
    ```sh
    uv run python scripts/renumber_footnotes.py [--dry-run]
    ```
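The "counter continues across files" behaviour can be sketched as follows: walk the files in course order, map each file-local footnote number to the next global counter value, and rewrite both references and definitions consistently. This is an illustrative sketch, not the script's actual code.

```python
import re

def renumber_footnotes(files: list[str], start: int = 1) -> list[str]:
    """Renumber [^N] references and [^N]: definitions sequentially
    across a list of file contents, in order."""
    counter = start
    out = []
    for text in files:
        mapping: dict[str, str] = {}  # old number -> new number, per file

        def repl(m: re.Match) -> str:
            nonlocal counter
            old = m.group(1)
            if old not in mapping:
                mapping[old] = str(counter)
                counter += 1
            return "[^" + mapping[old] + "]"

        out.append(re.sub(r"\[\^(\d+)\]", repl, text))
    return out

lesson1 = "A[^1] and B[^2].\n[^1]: note a\n[^2]: note b"
lesson2 = "C[^1].\n[^1]: note c"  # numbering restarts at 1 in the source
print(renumber_footnotes([lesson1, lesson2])[1])  # C[^3]. / [^3]: note c
```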
- `check_renumber.py`: Detects and corrects numbering inconsistencies in Pāḷi sentence lists. Supports dry-run and automatic renumbering of exercises and answer keys.
  - Usage:
    ```sh
    uv run python scripts/check_renumber.py [--dry-run]
    ```
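Sentence-list renumbering is simpler than footnote renumbering, since the markers are positional: rewrite each leading `N.` so the list counts 1, 2, 3, ... regardless of what the source says. A minimal sketch (hypothetical helper name, not the script's actual code):

```python
import re

def renumber_list(markdown: str) -> str:
    """Rewrite 'N. ' markers at the start of each line to count 1, 2, 3, ..."""
    counter = 0

    def repl(match: re.Match) -> str:
        nonlocal counter
        counter += 1
        return f"{counter}. "

    return re.sub(r"(?m)^\d+\.\s+", repl, markdown)

broken = "1. paṭhamaṃ\n1. dutiyaṃ\n4. tatiyaṃ"
print(renumber_list(broken))  # 1. / 2. / 3.
```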
- `clean_dead_links.py`: Finds and removes dead links in Markdown files. Specifically targets list items in index files that link to removed `.md` files.
  - Usage:
    ```sh
    uv run python scripts/clean_dead_links.py
    ```
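Detecting such dead list items amounts to parsing each line's Markdown link and checking whether its `.md` target still exists on disk. A minimal sketch, with a hypothetical `dead_link_items` helper and a throwaway temporary directory standing in for `docs/`:

```python
import re
import tempfile
from pathlib import Path

def dead_link_items(index_text: str, docs_dir: Path) -> list[str]:
    """Return list items whose Markdown link targets a .md file
    that no longer exists under docs_dir."""
    dead = []
    for line in index_text.splitlines():
        m = re.search(r"\[[^\]]*\]\(([^)]+\.md)\)", line)
        if m and not (docs_dir / m.group(1)).exists():
            dead.append(line)
    return dead

with tempfile.TemporaryDirectory() as d:
    docs = Path(d)
    (docs / "lesson1.md").write_text("# Lesson 1")
    index = "- [Lesson 1](lesson1.md)\n- [Lesson 2](lesson2.md)"
    result = dead_link_items(index, docs)
print(result)  # only the lesson2 item is dead
```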
- `fix_heading_hierarchy.py`: Normalizes heading levels across all Markdown files. Converts bolded top lines to H1 headings and ensures no heading levels are skipped (e.g., `#` followed by `###` becomes `#` followed by `##`).
  - Usage:
    ```sh
    uv run python scripts/fix_heading_hierarchy.py
    ```
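The "no skipped levels" rule can be expressed as: each heading may be at most one level deeper than the previous one. A sketch of that part only (the bold-to-H1 conversion is omitted; the helper name is hypothetical):

```python
import re

def fix_heading_levels(markdown: str) -> str:
    """Clamp each heading to at most one level deeper than the previous one."""
    out, prev = [], 0
    for line in markdown.splitlines():
        m = re.match(r"^(#+)\s+(.*)", line)
        if m:
            level = min(len(m.group(1)), prev + 1)
            prev = level
            line = "#" * level + " " + m.group(2)
        out.append(line)
    return "\n".join(out)

doc = "# Title\n### Skipped\nbody"
print(fix_heading_levels(doc))  # '### Skipped' becomes '## Skipped'
```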
- `fixing_tables.py`: Performs automated cleanup of Markdown tables. Normalizes cell padding, standardizes separator rows, and strips unnecessary bolding from footnote definitions.
  - Usage:
    ```sh
    uv run python scripts/fixing_tables.py
    ```
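Two of those cleanups, cell padding and separator rows, can be sketched as small pure functions (illustrative only; hypothetical helper names):

```python
def normalize_table_row(row: str) -> str:
    """Trim each cell to single-space padding: '|a  |b|' -> '| a | b |'."""
    cells = [c.strip() for c in row.strip().strip("|").split("|")]
    return "| " + " | ".join(cells) + " |"

def normalize_separator(row: str, width: int = 3) -> str:
    """Rewrite a separator row with a uniform dash run per column."""
    ncols = len(row.strip().strip("|").split("|"))
    return "| " + " | ".join(["-" * width] * ncols) + " |"

print(normalize_table_row("|pāḷi |meaning|"))  # | pāḷi | meaning |
print(normalize_separator("|---|------|"))     # | --- | --- |
```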
- `generate_mkdocs_yaml.py`: Helper script to update `mkdocs.yaml` based on the course folder structure. Automatically generates the navigation section using headings from the Markdown files.
  - Usage:
    ```sh
    uv run python scripts/generate_mkdocs_yaml.py
    ```
- `generate_indexes.py`: Generates `index.md` pages for course categories, creating a table of contents based on the individual lesson headings.
  - Usage:
    ```sh
    uv run python scripts/generate_indexes.py
    ```
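An index of this kind is essentially: for each lesson file, take its first H1 heading as the link text and its filename as the link target. A minimal sketch of that idea (hypothetical helper name; the real script works on files on disk rather than an in-memory dict):

```python
import re

def build_index(lessons: dict[str, str], title: str) -> str:
    """Build an index.md table of contents: one list entry per lesson,
    linked by filename and titled by the lesson's first H1 heading."""
    lines = [f"# {title}", ""]
    for filename, text in sorted(lessons.items()):
        m = re.search(r"(?m)^#\s+(.+)$", text)
        heading = m.group(1) if m else filename
        lines.append(f"- [{heading}]({filename})")
    return "\n".join(lines)

lessons = {"lesson1.md": "# Nouns\n...", "lesson2.md": "# Verbs\n..."}
print(build_index(lessons, "Beginner Pāḷi Course"))
```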
- `update_css.py`: Synchronizes CSS variables from the source configurations to the Identity stylesheet directory.
  - Usage:
    ```sh
    uv run python scripts/update_css.py
    ```
- `vocab_abbrev_pali_course.py` (in `dpd-db`): Generates Markdown vocabulary and abbreviation reference pages from the DPD database. These pages are published in the "Reference" section of the website.
  - Usage:
    ```sh
    cd ../dpd-db && uv run python scripts/export/vocab_abbrev_pali_course.py
    ```
Pre-processing scripts run a series of checks and corrections before building the website or documents:
- `web_preprocessing.sh`: Runs all pre-processing steps required before building the MkDocs website (generates metadata, renumbers content, cleans links, updates CSS).
  - Usage:
    ```sh
    uv run bash scripts/web_preprocessing.sh
    ```
- `pdf_preprocessing.sh`: Runs the pre-processing steps required before generating PDF and DOCX documents.
  - Usage:
    ```sh
    uv run bash scripts/pdf_preprocessing.sh
    ```
- `download_all_materials.py`: Downloads the old source materials from Google Docs as a ZIP archive, making it possible to keep the Markdown source files in sync with the original (old) sources if needed (for reference and backup purposes).
  - Usage:
    ```sh
    uv run python scripts/download_all_materials.py
    ```
This project uses uv for Python dependency management.
- Install uv: Follow the instructions at [astral.sh/uv](https://astral.sh/uv).
- Install dependencies:

  ```sh
  uv sync
  ```

- Build and serve the website locally. We recommend using the included unified build script, which handles metadata generation, renumbering, and starting the local server:

  ```sh
  ./scripts/cl/pali-build-website
  ```

  To run this from anywhere on your system, add `scripts/cl/` to your `PATH` (e.g., `fish_add_path /path/to/dpd-pali-courses/scripts/cl` in Fish). The site will be available at `http://127.0.0.1:8000`.

- Generate documents locally:

  ```sh
  uv run python scripts/generate_pdfs.py
  uv run python scripts/generate_docx.py
  ```
The website, PDF volumes, and DOCX volumes are automatically updated whenever changes are pushed to the main branch.
- Website deployment: Handled by `.github/workflows/deploy_site.yaml`.
- Document generation: Handled by a unified workflow that generates both PDF and DOCX artifacts and publishes them to the latest GitHub Release.
Note: These Google Docs are the original sources. They will be removed once the Markdown conversion and website are fully verified.