jsoup is a Python library that converts JSON structures into BeautifulSoup HTML/XML trees. It is the inverse of bs2json: build HTML from dictionaries, with full support for attributes, comments, doctypes, and nested elements.
Requires Python 3.8+. Only dependency: beautifulsoup4.
Table of Contents
| Section | Description |
|---|---|
| Installation | How to install |
| Quick Start | Basic usage |
| Input Format | How JSON maps to HTML |
| Features | Attributes, lists, comments, empty elements, doctypes |
| bs2json Roundtrip | Using bs2json output as jsoup input |
| Options | Custom labels, duplicate attributes, char refs |
| API Reference | JsonTreeBuilder, install() |
| Contributing | How to contribute |
Installation
pip install -U jsoup
Quick Start
from jsoup import JsonTreeBuilder
from bs4 import BeautifulSoup
json = {
    "body": {
        "h1": {"attrs": {"class": "title"}, "text": "Hello World"},
        "p": "This is a paragraph.",
        "br": None,
        "ul": {
            "li": ["Item 1", "Item 2", "Item 3"]
        }
    }
}
soup = BeautifulSoup(json, builder=JsonTreeBuilder)
print(soup.prettify())
Output:
<body>
<h1 class="title">
Hello World
</h1>
<p>
This is a paragraph.
</p>
<br/>
<ul>
<li>Item 1</li>
<li>Item 2</li>
<li>Item 3</li>
</ul>
</body>
Input Format
| JSON | HTML |
|---|---|
| {"p": "text"} | <p>text</p> |
| {"br": None} | <br/> |
| {"p": {"attrs": {"class": "x"}, "text": "hello"}} | <p class="x">hello</p> |
| {"li": ["a", "b", "c"]} | <li>a</li><li>b</li><li>c</li> |
| {"comment": "note"} | <!--note--> |
| {"doctype": "html"} | <!DOCTYPE html> |
| {"div": {"children": [{"p": "a"}, {"p": "b"}]}} | <div><p>a</p><p>b</p></div> |
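The mapping rules in the table can be illustrated with a small standard-library sketch. This is not jsoup's implementation: `render` and `_render_tag` are hypothetical helpers written only for this example, they produce a plain string rather than a BeautifulSoup tree, and the order in which text, nested keys, and children are emitted is an assumption.

```python
from html import escape


def render(node, attr_name="attrs", text_name="text", children_name="children"):
    """Render a dict in the input format above to an HTML string (illustration only)."""
    return "".join(
        _render_tag(name, value, attr_name, text_name, children_name)
        for name, value in node.items()
    )


def _render_tag(name, value, attr_name, text_name, children_name):
    labels = (attr_name, text_name, children_name)
    if name == "comment":
        return f"<!--{value}-->"
    if name == "doctype":
        return f"<!DOCTYPE {value}>"
    if value is None:
        # None marks an empty element
        return f"<{name}/>"
    if isinstance(value, list):
        # A list repeats the same tag once per item
        return "".join(_render_tag(name, item, *labels) for item in value)
    if isinstance(value, str):
        # A plain string is the element's text content
        return f"<{name}>{escape(value, quote=False)}</{name}>"

    # A dict may carry attributes, text, nested tags, and an ordered children list.
    attrs = "".join(
        f' {key}="{escape(str(val))}"' for key, val in value.get(attr_name, {}).items()
    )
    inner = escape(str(value.get(text_name, "")), quote=False)
    inner += "".join(
        _render_tag(key, val, *labels) for key, val in value.items() if key not in labels
    )
    inner += "".join(render(child, *labels) for child in value.get(children_name, []))
    return f"<{name}{attrs}>{inner}</{name}>"


print(render({"p": {"attrs": {"class": "x"}, "text": "hello"}}))
# <p class="x">hello</p>
print(render({"div": {"children": [{"p": "a"}, {"p": "b"}]}}))
# <div><p>a</p><p>b</p></div>
```

The sketch does not treat void elements such as img specially, so it is useful for reasoning about the input format, not as a replacement for the tree builder.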
Features
Attributes
Attributes are passed via the attrs key:
json = {
    "a": {"attrs": {"href": "/home", "class": "nav"}, "text": "Home"},
    "img": {"attrs": {"src": "photo.jpg", "alt": "Photo"}}
}
Produces:
<a class="nav" href="/home">Home</a>
<img alt="Photo" src="photo.jpg"/>
Lists (Multiple Same Tags)
A list value creates multiple tags with the same name:
json = {"ul": {"li": ["Apple", "Banana", "Cherry"]}}
Produces:
<ul><li>Apple</li><li>Banana</li><li>Cherry</li></ul>
List items can also be dicts with nested content:
json = {"ul": {"li": [
    "Simple item",
    {"text": "Item with link", "a": {"attrs": {"href": "/"}, "text": "click"}}
]}}
Comments
json = {
    "body": {
        "comment": "This is a comment",
        "p": "Visible text"
    }
}
# Produces: <!--This is a comment--><p>Visible text</p>
Empty Elements
Use None for self-closing tags:
json = {"body": {"br": None, "hr": None}}
# Produces: <body><br/><hr/></body>
Doctypes
json = {
    "doctype": "html",
    "html": {"body": {"p": "content"}}
}
Nested Structures
Dicts can be nested to any depth; each nested key becomes a child element:
json = {
    "html": {
        "head": {"title": "My Page"},
        "body": {
            "header": {
                "nav": {"ul": {"li": [
                    {"a": {"attrs": {"href": "/"}, "text": "Home"}},
                    {"a": {"attrs": {"href": "/about"}, "text": "About"}}
                ]}}
            },
            "main": {"h1": "Welcome", "p": "Content here"},
            "footer": {"p": "Copyright 2026"}
        }
    }
}
bs2json Roundtrip
jsoup understands the children key from bs2json's ordered output, enabling roundtrip conversion:
from bs2json import BS2Json
from bs4 import BeautifulSoup
from jsoup import JsonTreeBuilder
# HTML -> JSON (bs2json)
html = "<html><body><h1>Title</h1><p>Text</p><h1>Another</h1></body></html>"
json_data = BS2Json(html).convert()
# {'html': {'body': {'children': [{'h1': 'Title'}, {'p': 'Text'}, {'h1': 'Another'}]}}}
# JSON -> HTML (jsoup)
soup = BeautifulSoup(json_data, builder=JsonTreeBuilder)
print(soup.prettify())
# <html><body><h1>Title</h1><p>Text</p><h1>Another</h1></body></html>
The children key preserves element order, including elements with attributes:
json = {
    "table": {
        "attrs": {"id": "data"},
        "children": [
            {"tr": {"children": [{"th": "Name"}, {"th": "Score"}]}},
            {"tr": {"children": [{"td": "Alice"}, {"td": "95"}]}}
        ]
    }
}
Using install() for Cleaner Syntax
Register jsoup so you can use "jsoup" as a parser string:
from jsoup import install
install()
from bs4 import BeautifulSoup
soup = BeautifulSoup({"p": "hello"}, "jsoup")
Options
Custom Label Names
Override the default key names for attributes, text, and children:
json = {"p": {"@": {"class": "x"}, "#text": "hello"}}
soup = BeautifulSoup(json, builder=JsonTreeBuilder,
                     attr_name='@', text_name='#text')
# <p class="x">hello</p>
Duplicate Attributes
Control how duplicate attribute keys are handled when attrs is a list of dicts:
json = {"p": {"attrs": [{"class": "a"}, {"class": "b"}], "text": "hello"}}
# Replace (default): last value wins
soup = BeautifulSoup(json, builder=JsonTreeBuilder, on_duplicate_attribute="replace")
# Ignore: first value wins
soup = BeautifulSoup(json, builder=JsonTreeBuilder, on_duplicate_attribute="ignore")
# Callable: custom merge logic
def merge(attrs, name, value):
    # Append the duplicate value to the existing one, separated by a space
    attrs[name] += " " + value
soup = BeautifulSoup(json, builder=JsonTreeBuilder, on_duplicate_attribute=merge)
Character References
Special characters in text are escaped as HTML entities automatically:
json = {"p": "1<2 && 2>1"}
soup = BeautifulSoup(json, builder=JsonTreeBuilder)
# <p>1&lt;2 &amp;&amp; 2&gt;1</p>
API Reference
JsonTreeBuilder
A BeautifulSoup TreeBuilder that accepts JSON dicts as input.
from bs4 import BeautifulSoup
from jsoup import JsonTreeBuilder
soup = BeautifulSoup(json_data, builder=JsonTreeBuilder, **options)
Options (passed as kwargs to BeautifulSoup):
| Option | Default | Description |
|---|---|---|
| attr_name | "attrs" | JSON key for element attributes |
| text_name | "text" | JSON key for text content |
| children_name | "children" | JSON key for ordered children list |
| on_duplicate_attribute | "replace" | How to handle duplicate attributes: "replace", "ignore", or a callable |
| convert_charref | True | Whether to escape HTML entities |
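The three on_duplicate_attribute policies in the table can be sketched with plain dicts. This is an illustration under assumptions, not jsoup's code: `merge_attrs` and `join_with_space` are hypothetical names, and the callable is assumed to receive the merged dict, the attribute name, and the new value, as in the merge example shown earlier.

```python
def merge_attrs(attr_dicts, on_duplicate_attribute="replace"):
    """Fold a list of attribute dicts into one dict under the given policy."""
    merged = {}
    for attrs in attr_dicts:
        for name, value in attrs.items():
            if name not in merged:
                merged[name] = value
            elif on_duplicate_attribute == "replace":
                merged[name] = value  # last value wins
            elif on_duplicate_attribute == "ignore":
                continue  # first value wins
            elif callable(on_duplicate_attribute):
                # Custom logic mutates the merged dict in place
                on_duplicate_attribute(merged, name, value)
            else:
                raise ValueError(f"unknown policy: {on_duplicate_attribute!r}")
    return merged


def join_with_space(attrs, name, value):
    """Example callable: keep both values, separated by a space."""
    attrs[name] += " " + value


dupes = [{"class": "a"}, {"class": "b"}]
print(merge_attrs(dupes))                   # {'class': 'b'}
print(merge_attrs(dupes, "ignore"))         # {'class': 'a'}
print(merge_attrs(dupes, join_with_space))  # {'class': 'a b'}
```

A callable is the only policy that can combine values, which is why it suits multi-valued attributes such as class.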
install()
Register JsonTreeBuilder so "jsoup" can be used as a parser string:
from jsoup import install
install(debug=False)
After calling install():
soup = BeautifulSoup(json_data, "jsoup")
Contributing
See CONTRIBUTING.md for development setup, versioning guide, and how to submit changes.