A library and command-line tool for working with PDF interactive forms. It can:
- Describe the available fields in the form
- Convert the PDF to an HTML form
- Populate the PDF form with data
Uses Pikepdf.
The pdform describe command can be used to get information about a PDF form, such as the names and types of fields, allowable options, and so forth. Use pdform describe --help for command-line options.
pdform describe form.pdfBy default, it will show every field in the form, together with all the relevant details about the field. Command-line options exist to filter this view for easier parsing.
=========================================================================
stream <_io.BufferedReader name='../../pikepdf/tests/resources/form.pdf'>
=========================================================================
Text1
-----
Label:
Text1
Type:
TextField
Required:
No
Read Only:
No
Multiline:
No
Max Length:
None
Default Value:
Current Value:
... and so on ...
Filling forms is done using the pdform fill-form command. Typically, this will be done using JSON-formatted data, such as:
{
"TextField1": "Some Text",
"Checkbox1": true,
"RadioGroup1": "3",
"ChoiceField1": "Option 4",
"SignatureField": "/home/myself/signature.png"
}You can then call the command with this JSON:
pdform fill-form template.pdf output.pdf data.jsonOr pipe this JSON into the command:
echo {your json here} | pdform fill-form template.pdf output.pdf -Converting to HTML relies on pdf2htmlEX to generate the initial HTML. We then use BeautifulSoup to strip away most of the unnessesary code, and add the form fields.
This function can be activated in one of two ways:
- The
pdform make-htmlcommand. Usepdform make-html --helpfor details. - Directly via Python.
The command-line interface is sufficient for basic usage. It provides a handful of different output formats: plain HTML, Jinja, and PHP.
pdform make-html --jinja input.pdf output.jinjaHowever, it is likely you may wish to customize the rendered HTML. The Python interfaces gives much more flexibility for this.
from pdform.make_html import FieldRenderer, make_html
# Define your own field renderer to control the emitted code for form fields
class MyFieldRenderer(FieldRenderer):
# See the source code for details on how to implement this class
...
soup = make_html(path, field_renderer_class=MyFieldRenderer)
# Use the BeautifulSoup object to perform any post-processing to the generated HTML
# (See the BeautifulSoup documentation for how to use it to manipulate the DOM)
...
# Output the rendered HTML
print(soup.prettify())