jsoup

A Python library that converts JSON structures into BeautifulSoup HTML/XML trees. The inverse of bs2json — build HTML from dictionaries with full support for attributes, comments, doctypes, and nested elements.

Python 3.8+ | Only dependency: beautifulsoup4

Table of Contents

Section	Description
Installation	How to install
Quick Start	Basic usage
Input Format	How JSON maps to HTML
Features	Attributes, lists, comments, empty elements, doctypes
bs2json Roundtrip	Using bs2json output as jsoup input
Options	Custom labels, duplicate attributes, char refs
API Reference	JsonTreeBuilder, install()
Contributing	How to contribute

Installation

pip install -U jsoup

Quick Start

from jsoup import JsonTreeBuilder
from bs4 import BeautifulSoup

json = {
    "body": {
        "h1": {"attrs": {"class": "title"}, "text": "Hello World"},
        "p": "This is a paragraph.",
        "br": None,
        "ul": {
            "li": ["Item 1", "Item 2", "Item 3"]
        }
    }
}

soup = BeautifulSoup(json, builder=JsonTreeBuilder)
print(soup.prettify())

Output:

<body>
 <h1 class="title">
  Hello World
 </h1>
 <p>
  This is a paragraph.
 </p>
 <br/>
 <ul>
  <li>
   Item 1
  </li>
  <li>
   Item 2
  </li>
  <li>
   Item 3
  </li>
 </ul>
</body>

Input Format

JSON (as a Python dict)	HTML
`{"p": "text"}`	`<p>text</p>`
`{"br": None}`	`<br/>`
`{"p": {"attrs": {"class": "x"}, "text": "hello"}}`	`<p class="x">hello</p>`
`{"li": ["a", "b", "c"]}`	`<li>a</li><li>b</li><li>c</li>`
`{"comment": "note"}`	`<!--note-->`
`{"doctype": "html"}`	`<!DOCTYPE html>`
`{"div": {"children": [{"p": "a"}, {"p": "b"}]}}`	`<div><p>a</p><p>b</p></div>`
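To make the mapping in the table concrete, here is a toy renderer that applies the same rules to a dict and returns an HTML string, using only the standard library. It illustrates the mapping and is not jsoup's implementation: the `render` and `_render_tag` names are hypothetical, the handling of cases the table does not list is an assumption, and the real library builds a BeautifulSoup tree rather than a string.

```python
from html import escape

# Hypothetical helpers: they mirror the table's rules, not jsoup's real code.
def render(node):
    """Render a mapping of tag names to values as an HTML string."""
    return "".join(_render_tag(name, value) for name, value in node.items())

def _render_tag(name, value):
    if name == "comment":
        return f"<!--{value}-->"
    if name == "doctype":
        return f"<!DOCTYPE {value}>"
    if isinstance(value, list):
        # A list repeats the same tag once per item.
        return "".join(_render_tag(name, item) for item in value)
    if value is None:
        return f"<{name}/>"
    if isinstance(value, str):
        return f"<{name}>{escape(value, quote=False)}</{name}>"
    # Otherwise value is a dict: handle the reserved keys, then nested tags.
    attrs = "".join(
        f' {key}="{escape(str(val))}"'
        for key, val in sorted(value.get("attrs", {}).items())
    )
    inner = escape(value.get("text", ""), quote=False)
    inner += "".join(render(child) for child in value.get("children", []))
    inner += render({k: v for k, v in value.items()
                     if k not in ("attrs", "text", "children")})
    return f"<{name}{attrs}>{inner}</{name}>"

print(render({"p": {"attrs": {"class": "x"}, "text": "hello"}}))
print(render({"div": {"children": [{"p": "a"}, {"p": "b"}]}}))
```

The two calls print `<p class="x">hello</p>` and `<div><p>a</p><p>b</p></div>`, matching the corresponding table rows.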

Features

Attributes

Attributes are passed via the attrs key:

json = {
    "a": {"attrs": {"href": "/home", "class": "nav"}, "text": "Home"},
    "img": {"attrs": {"src": "photo.jpg", "alt": "Photo"}}
}

Produces:

<a class="nav" href="/home">Home</a>
<img alt="Photo" src="photo.jpg"/>

Lists (Multiple Same Tags)

A list value creates multiple tags with the same name:

json = {"ul": {"li": ["Apple", "Banana", "Cherry"]}}

Produces:

<ul><li>Apple</li><li>Banana</li><li>Cherry</li></ul>

List items can also be dicts with nested content:

json = {"ul": {"li": [
    "Simple item",
    {"text": "Item with link", "a": {"attrs": {"href": "/"}, "text": "click"}}
]}}

Comments

json = {
    "body": {
        "comment": "This is a comment",
        "p": "Visible text"
    }
}
# Produces: <body><!--This is a comment--><p>Visible text</p></body>

Empty Elements

Use None for self-closing tags:

json = {"body": {"br": None, "hr": None}}
# Produces: <body><br/><hr/></body>
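When the input starts out as JSON text rather than a Python literal, `null` is the value that becomes `None`. The short sketch below uses only the standard library to show that step; the `raw` string is made-up sample data, and passing the result to BeautifulSoup works as in Quick Start.

```python
import json

# Sample JSON text; in practice this would come from a file or an API response.
raw = '{"body": {"br": null, "hr": null}}'

# Named `data` so the json module is not shadowed by the variable.
data = json.loads(raw)

print(data)  # {'body': {'br': None, 'hr': None}}
```

Note that the snippets in this README call the dict `json`; pick a different variable name in any file that also imports the `json` module.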

Doctypes

json = {
    "doctype": "html",
    "html": {"body": {"p": "content"}}
}

Nested Structures

Nesting works naturally:

json = {
    "html": {
        "head": {"title": "My Page"},
        "body": {
            "header": {
                "nav": {"ul": {"li": [
                    {"a": {"attrs": {"href": "/"}, "text": "Home"}},
                    {"a": {"attrs": {"href": "/about"}, "text": "About"}}
                ]}}
            },
            "main": {"h1": "Welcome", "p": "Content here"},
            "footer": {"p": "Copyright 2026"}
        }
    }
}
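Because the input is an ordinary dict, repetitive parts of a nested structure can be generated instead of typed out. The sketch below builds the navigation list from data; `nav_items` is a hypothetical helper written for this example, not part of jsoup.

```python
# Hypothetical helper, not part of jsoup: builds the "li" list from pairs.
def nav_items(links):
    """Turn (href, label) pairs into a list of li entries, each wrapping a link."""
    return [{"a": {"attrs": {"href": href}, "text": label}}
            for href, label in links]

page = {
    "html": {
        "head": {"title": "My Page"},
        "body": {
            "nav": {"ul": {"li": nav_items([("/", "Home"), ("/about", "About")])}},
        },
    }
}

print(page["html"]["body"]["nav"]["ul"]["li"][0])
```

The resulting `page` dict has the same shape as the hand-written example and can be passed to BeautifulSoup in the same way.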

bs2json Roundtrip

jsoup understands the children key from bs2json's ordered output, enabling roundtrip conversion:

from bs2json import BS2Json
from bs4 import BeautifulSoup
from jsoup import JsonTreeBuilder

# HTML -> JSON (bs2json)
html = "<html><body><h1>Title</h1><p>Text</p><h1>Another</h1></body></html>"
json_data = BS2Json(html).convert()
# {'html': {'body': {'children': [{'h1': 'Title'}, {'p': 'Text'}, {'h1': 'Another'}]}}}

# JSON -> HTML (jsoup)
soup = BeautifulSoup(json_data, builder=JsonTreeBuilder)
print(soup.prettify())
# Same markup as the input, pretty-printed:
# <html><body><h1>Title</h1><p>Text</p><h1>Another</h1></body></html>

The children key preserves element order, including elements with attributes:

json = {
    "table": {
        "attrs": {"id": "data"},
        "children": [
            {"tr": {"children": [{"th": "Name"}, {"th": "Score"}]}},
            {"tr": {"children": [{"td": "Alice"}, {"td": "95"}]}}
        ]
    }
}
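The reason a children list is needed for interleaved siblings comes down to how dicts work: a key can appear only once. This plain-Python sketch, which does not use jsoup at all, shows what is lost without it.

```python
# A dict literal with a repeated key keeps only one entry for that key.
flat = {"h1": "Title", "p": "Text", "h1": "Another"}
print(flat)  # {'h1': 'Another', 'p': 'Text'}

# The children form keeps all three elements in document order.
ordered = {"children": [{"h1": "Title"}, {"p": "Text"}, {"h1": "Another"}]}
tags = [next(iter(child)) for child in ordered["children"]]
print(tags)  # ['h1', 'p', 'h1']
```

A list value such as `{"h1": ["Title", "Another"], "p": "Text"}` would keep both headings, but not the fact that the paragraph sits between them.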

Options

Using install() for Cleaner Syntax

Register jsoup so you can use "jsoup" as a parser string:

from jsoup import install
install()

from bs4 import BeautifulSoup
soup = BeautifulSoup({"p": "hello"}, "jsoup")

Custom Label Names

Override the default key names for attributes, text, and children with attr_name, text_name, and children_name:

json = {"p": {"@": {"class": "x"}, "#text": "hello"}}
soup = BeautifulSoup(json, builder=JsonTreeBuilder,
                     attr_name='@', text_name='#text')
# <p class="x">hello</p>

Duplicate Attributes

Control how duplicate attribute keys are handled when attrs is a list of dicts:

json = {"p": {"attrs": [{"class": "a"}, {"class": "b"}], "text": "hello"}}

# Replace (default): last value wins
soup = BeautifulSoup(json, builder=JsonTreeBuilder, on_duplicate_attribute="replace")

# Ignore: first value wins
soup = BeautifulSoup(json, builder=JsonTreeBuilder, on_duplicate_attribute="ignore")

# Callable: custom merge logic
def merge(attrs, name, value):
    # attrs holds the values seen so far; append the new one, space-separated
    attrs[name] += " " + value

soup = BeautifulSoup(json, builder=JsonTreeBuilder, on_duplicate_attribute=merge)
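To see what each strategy does to the final attribute dict, the following standalone sketch replays the three modes on plain dicts. `collect_attrs` is a hypothetical stand-in written for this example; it assumes the builder resolves duplicates one key at a time in list order, which may not match jsoup's internals exactly.

```python
# Hypothetical stand-in for the builder's attribute handling; the name
# `collect_attrs` and its exact behavior are assumptions for illustration.
def collect_attrs(attr_dicts, on_duplicate="replace"):
    merged = {}
    for attrs in attr_dicts:
        for name, value in attrs.items():
            if name not in merged:
                merged[name] = value
            elif on_duplicate == "replace":
                merged[name] = value               # last value wins
            elif on_duplicate == "ignore":
                pass                               # first value wins
            else:
                on_duplicate(merged, name, value)  # the callable decides
    return merged

def merge(attrs, name, value):
    # Join duplicate values with a space, e.g. for class lists.
    attrs[name] += " " + value

pairs = [{"class": "a"}, {"class": "b"}]
print(collect_attrs(pairs))            # {'class': 'b'}
print(collect_attrs(pairs, "ignore"))  # {'class': 'a'}
print(collect_attrs(pairs, merge))     # {'class': 'a b'}
```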

Character References

Special characters in text (<, >, &) are escaped to HTML entities automatically:

json = {"p": "1<2 && 2>1"}
soup = BeautifulSoup(json, builder=JsonTreeBuilder)
# <p>1&lt;2 &amp;&amp; 2&gt;1</p>
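The same transformation can be reproduced with the standard library, which is handy for checking what a given string will look like once escaped. This sketch uses `html.escape` for illustration only; it is not a claim about how jsoup performs the escaping internally.

```python
from html import escape

text = "1<2 && 2>1"

# quote=False leaves quotation marks alone, as is usual for text nodes.
escaped = escape(text, quote=False)
print(escaped)  # 1&lt;2 &amp;&amp; 2&gt;1
```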

API Reference

JsonTreeBuilder

A BeautifulSoup TreeBuilder that accepts JSON dicts as input.

from bs4 import BeautifulSoup
from jsoup import JsonTreeBuilder
soup = BeautifulSoup(json_data, builder=JsonTreeBuilder, **options)

Options (passed as kwargs to BeautifulSoup):

Option	Default	Description
`attr_name`	`"attrs"`	JSON key for element attributes
`text_name`	`"text"`	JSON key for text content
`children_name`	`"children"`	JSON key for ordered children list
`on_duplicate_attribute`	`"replace"`	How to handle duplicate attrs: `"replace"`, `"ignore"`, or callable
`convert_charref`	`True`	Whether to escape special characters (`<`, `>`, `&`) as HTML entities

install()

Register JsonTreeBuilder so "jsoup" can be used as a parser string:

from jsoup import install
install(debug=False)

After calling install():

soup = BeautifulSoup(json_data, "jsoup")

Contributing

See CONTRIBUTING.md for development setup, versioning guide, and how to submit changes.
