This is a minimalistic implementation for CBOR, the Concise Binary Object Representation. CBOR is defined by IETF RFC 8949, and Wikipedia has a good description.
- C99
- No dynamic memory allocation
- Small code footprint
CBOR_ROOT ?= <THIRD_PARTY_DIR>/cbor
include $(CBOR_ROOT)/cbor.mkinclude(FetchContent)
FetchContent_Declare(cbor
GIT_REPOSITORY https://github.com/libmcu/cbor.git
GIT_TAG main
)
FetchContent_MakeAvailable(cbor)or
set(CBOR_ROOT <THIRD_PARTY_DIR>/cbor)
include(${CBOR_ROOT}/cbor.cmake)Add the module to your west manifest:
# west.yml
manifest:
projects:
- name: cbor
url: https://github.com/libmcu/cbor.git
revision: main
path: modules/lib/cborEnable in prj.conf:
CONFIG_LIBMCU_CBOR=y
Clone or add as a git submodule under your project's components/ directory:
cd components
git submodule add https://github.com/libmcu/cbor.git cborNo wrapper CMakeLists.txt is needed. The root CMakeLists.txt supports both
plain CMake and ESP-IDF component contexts. ESP-IDF detection uses
COMMAND idf_component_register, so builds that define ESP_PLATFORM without
entering a real component context still fall back to the freestanding CMake
path.
cbor_unmarshal() dispatches parsed CBOR nodes to registered callbacks by
matching each node's position in the tree against a declared path. A path is
an ordered list of struct cbor_path_segment values—one per nesting level—built with
the CBOR_STR_SEG, CBOR_INT_SEG, and CBOR_IDX_SEG helpers and wrapped
into a cbor_parser via the CBOR_PATH macro.
static void parse_cert(const cbor_reader_t *reader,
const struct cbor_parser *parser,
const cbor_item_t *item, void *arg) {
struct your_data_type *out = arg;
cbor_decode(reader, item, out->cert, sizeof(out->cert));
}
static void parse_key(const cbor_reader_t *reader,
const struct cbor_parser *parser,
const cbor_item_t *item, void *arg) {
struct your_data_type *out = arg;
cbor_decode(reader, item, out->key, sizeof(out->key));
}
static const struct cbor_path_segment path_cert[] = { CBOR_STR_SEG("certificate") };
static const struct cbor_path_segment path_key[] = { CBOR_STR_SEG("privateKey") };
static const struct cbor_parser parsers[] = {
CBOR_PATH(path_cert, parse_cert),
CBOR_PATH(path_key, parse_key),
};
cbor_reader_t reader;
cbor_item_t items[MAX_ITEMS];
cbor_reader_init(&reader, items, sizeof(items) / sizeof(*items));
cbor_unmarshal(&reader, parsers, sizeof(parsers) / sizeof(*parsers),
msg, msglen, &your_data_type);Paths up to CBOR_RECURSION_MAX_LEVEL deep are supported (default 8). Each
segment specifies one key in the nesting hierarchy. All three key kinds can be
mixed freely within a single path.
| Macro | Matches |
|---|---|
CBOR_STR_SEG("key") |
map entry whose CBOR string key (text or byte string) equals "key" (string literal only) |
CBOR_INT_SEG(n) |
map entry whose integer key equals n |
CBOR_IDX_SEG(n) |
the n-th element (0-based) of an enclosing array |
CBOR_ANY_SEG() |
any map value whose key is a string or integer, or any array element at this depth (wildcard; map values under other key types are skipped; does not match map keys) |
Example — same key "id" appears under two different parents:
static const struct cbor_path_segment path_dev_ba[] = {
CBOR_STR_SEG("dev"), CBOR_STR_SEG("id")
};
static const struct cbor_path_segment path_prod_ba[] = {
CBOR_STR_SEG("prod"), CBOR_STR_SEG("id")
};
static const struct cbor_parser parsers[] = {
CBOR_PATH(path_dev_ba, on_dev_ba_id),
CBOR_PATH(path_prod_ba, on_prod_ba_id),
};Example — array element addressed by index (depth 3):
static const struct cbor_path_segment path_item0_name[] = {
CBOR_STR_SEG("items"), CBOR_IDX_SEG(0), CBOR_STR_SEG("name")
};Please refer to examples for complete runnable code including depth-4 nested maps, container callbacks, and mixed string/integer/index paths.
CBOR_BIG_ENDIAN- Define the macro for big endian machine. The default is little endian.
CBOR_RECURSION_MAX_LEVEL- This is set to avoid stack overflow from recursion. The default maximum depth is 8.
The parser takes 626 bytes on ARM Cortex-M0 optimizing for code size -Os.
arm-none-eabi-gcc
10-2020-q4-major
was used for the check.
Stack usage per the major type functions:
| Major type | Bytes |
|---|---|
| 0: unsigned integer | 12 |
| 1: negative integer | 12 |
| 2: byte string | 32 |
| 3: text string | 32 |
| 4: array | 32 |
| 5: map | 32 |
| 6: tag | 32 |
| 7: floating-point numbers, simple values and break | 32 |
And the call stack for each recursion is 24 bytes.
cbor_reader_t reader;
cbor_item_t items[MAX_ITEMS];
size_t n;
cbor_reader_init(&reader, items, sizeof(items) / sizeof(*items));
cbor_parse(&reader, cbor_message, cbor_message_len, &n);
for (size_t i = 0; i < n; i++) {
printf("item: %s, size: %zu\n",
cbor_stringify_item(&items[i]),
cbor_get_item_size(&items[i]));
}MAX_ITEMS is a capacity in number of parsed items, not bytes. The memory
footprint of this array is:
size_t item_buffer_bytes = MAX_ITEMS * sizeof(cbor_item_t);sizeof(cbor_item_t) is platform dependent (e.g. typically 12 bytes on many
32-bit MCUs, and often 24 bytes on 64-bit hosts).
Each cbor_item_t represents one parsed CBOR node (container, tag, and leaf
all count as one item). cbor_get_item_size() unit depends on the item type:
- Integer / float / simple value: encoded payload size in bytes
- Byte/text string (definite-length): string length in bytes
- Array (definite-length): number of child items
- Map (definite-length): number of key-value pairs
- Tag: tag number (e.g.
1for epoch-based date/time)
For indefinite-length strings, arrays, and maps, cbor_get_item_size()
returns the raw CBOR length field, which is the sentinel
(size_t)CBOR_INDEFINITE_VALUE. In that case, the number of child items / key-
value pairs is not known up front; iterate until the CBOR BREAK marker is
encountered instead of using cbor_get_item_size() as a count.
Example (0xA2 0x61 0x61 0x01 0x61 0x62 0x02, equivalent to
{"a": 1, "b": 2}):
- 1 map item
- 2 key items (
"a","b") - 2 value items (
1,2) - total parsed item count (
n) = 5
When a value is tagged, the tag occupies its own item slot immediately before
the wrapped item. For {"a": tag(1, 1234567890)}:
- 1 map item
- 1 key item (
"a") - 1 tag item (tag number =
1,cbor_get_item_size()returns1) - 1 integer item (
1234567890) - total parsed item count (
n) = 4
Use cbor_count_items() for dry-run counting without allocating cbor_item_t
buffer:
size_t needed_items = 0;
cbor_error_t err = cbor_count_items(cbor_message, cbor_message_len,
&needed_items);
if (err == CBOR_SUCCESS) {
/* allocate items[needed_items + margin] */
}In practice, use one of the following:
- Reserve a conservative static capacity based on your schema, or
- Call
cbor_count_items()first, then set productionMAX_ITEMSwith margin.
If items[] is too small to hold all parsed nodes, cbor_parse() returns
CBOR_OVERRUN (stringified as "out of memory").
cbor_error_t err = cbor_parse(&reader, msg, msglen, &n);
if (err == CBOR_OVERRUN) {
/* items[] capacity is insufficient: increase maxitems */
}In this case, n contains only the number of items actually stored before the
overrun.
The streaming decoder processes CBOR byte-by-byte (or in any chunk size) without requiring the full message up front. It issues events via a callback as each item is decoded, so it works on constrained systems where the complete payload may not fit in RAM at once.
#include "cbor/stream.h"
static bool on_event(const cbor_stream_event_t *event,
const cbor_stream_data_t *data, void *arg)
{
switch (event->type) {
case CBOR_STREAM_EVENT_UINT:
printf("uint: %llu\n", (unsigned long long)data->uint);
break;
case CBOR_STREAM_EVENT_INT:
printf("int: %lld\n", (long long)data->sint);
break;
case CBOR_STREAM_EVENT_TEXT:
printf("text[%zu/%lld]: %.*s\n",
data->str.len, (long long)data->str.total,
(int)data->str.len,
data->str.ptr ? (const char *)data->str.ptr : "");
break;
case CBOR_STREAM_EVENT_BYTES:
/* data->str.ptr points into the caller's buffer, or is NULL when len == 0 */
break;
case CBOR_STREAM_EVENT_ARRAY_START:
printf("array(%lld) depth=%u\n",
(long long)data->container.size, event->depth);
break;
case CBOR_STREAM_EVENT_ARRAY_END:
break;
case CBOR_STREAM_EVENT_MAP_START:
printf("map(%lld) depth=%u key=%d\n",
(long long)data->container.size,
event->depth, event->is_map_key);
break;
case CBOR_STREAM_EVENT_MAP_END:
break;
case CBOR_STREAM_EVENT_TAG:
printf("tag(%llu)\n", (unsigned long long)data->tag);
break;
case CBOR_STREAM_EVENT_FLOAT:
printf("float: %g\n", data->flt);
break;
case CBOR_STREAM_EVENT_BOOL:
printf("bool: %s\n", data->boolean ? "true" : "false");
break;
case CBOR_STREAM_EVENT_NULL:
printf("null\n");
break;
default:
break;
}
return true; /* return false to abort decoding */
}
cbor_stream_decoder_t decoder;
cbor_stream_init(&decoder, on_event, NULL);
/* Feed any number of chunks — partial items are held in decoder state */
cbor_error_t err = cbor_stream_feed(&decoder, chunk, chunk_len);
if (err != CBOR_SUCCESS) {
/* handle error */
}
/* After the last chunk, verify the stream ended on a clean boundary */
err = cbor_stream_finish(&decoder);Feeding can be split across as many calls as needed:
/* byte-at-a-time — fully supported */
for (size_t i = 0; i < msg_len; i++) {
cbor_stream_feed(&decoder, &msg[i], 1);
}
cbor_stream_finish(&decoder);To reuse the decoder after an error, call cbor_stream_reset():
if (cbor_stream_feed(&decoder, bad_data, len) != CBOR_SUCCESS) {
cbor_stream_reset(&decoder); /* clears error, preserves callback */
}| Event | data field |
Notes |
|---|---|---|
CBOR_STREAM_EVENT_UINT |
data->uint |
unsigned integer |
CBOR_STREAM_EVENT_INT |
data->sint |
negative integer |
CBOR_STREAM_EVENT_BYTES / _TEXT |
data->str |
ptr into caller's buffer, or NULL when len == 0; indefinite strings end with a final zero-length BREAK chunk carrying last=true |
CBOR_STREAM_EVENT_ARRAY_START / _END |
data->container.size (-1 if indefinite) / NULL |
container open/close |
CBOR_STREAM_EVENT_MAP_START / _END |
data->container.size (-1 if indefinite) / NULL |
event->is_map_key tracks key/value position |
CBOR_STREAM_EVENT_TAG |
data->tag |
emitted before the wrapped item's event(s) |
CBOR_STREAM_EVENT_FLOAT |
data->flt |
half/single/double all promoted to double |
CBOR_STREAM_EVENT_BOOL |
data->boolean |
|
CBOR_STREAM_EVENT_NULL / _UNDEFINED |
(unused) | callback receives a non-NULL data pointer, but no payload field is used |
CBOR_STREAM_EVENT_SIMPLE |
data->simple |
unassigned simple value |
| Code | Meaning |
|---|---|
CBOR_SUCCESS |
all bytes consumed cleanly |
CBOR_NEED_MORE |
finish() called with an incomplete item or open container |
CBOR_ILLEGAL |
reserved additional-info byte; malformed encoding |
CBOR_INVALID |
well-formed but semantically invalid (e.g. negative integer overflows int64) |
CBOR_EXCESSIVE |
nesting deeper than CBOR_RECURSION_MAX_LEVEL or tag nesting beyond CBOR_STREAM_MAX_PENDING_TAGS |
CBOR_ABORTED |
callback returned false |
After any non-CBOR_SUCCESS return produced while operating on a valid decoder
instance, the decoder is in a sticky error state. Subsequent feed() /
finish() calls return the same error immediately until you call
cbor_stream_reset().
API-level argument validation failures (for example NULL decoder/callback, or
NULL data with non-zero length) may also return CBOR_INVALID, but they do
not modify decoder state and therefore are not sticky.
You can estimate item count from CBOR bytes with:
python3 tools/cbor_item_count.py --hex "A2 61 61 01 61 62 02"or from file:
python3 tools/cbor_item_count.py --file ./message.cborIf your build overrides CBOR_RECURSION_MAX_LEVEL, pass the same depth to the
script so it matches parser behavior:
python3 tools/cbor_item_count.py --file ./message.cbor --max-depth 16The script prints parser-compatible status (CBOR_SUCCESS, CBOR_BREAK,
CBOR_ILLEGAL, etc.) and counted item number.
union {
int8_t i8;
int16_t i16;
int32_t i32;
int64_t i64;
float f32;
double f64;
uint8_t s[MTU];
} val;
cbor_decode(&reader, &items[i], &val, sizeof(val));cbor_writer_t writer;
cbor_writer_init(&writer, buf, sizeof(buf));
cbor_encode_map(&writer, 2);
/* 1st */
cbor_encode_text_string(&writer, "key", 3);
cbor_encode_text_string(&writer, "value", 5);
/* 2nd */
cbor_encode_text_string(&writer, "age", 3);
cbor_encode_negative_integer(&writer, -1);Call cbor_encode_tag() immediately before the item it wraps. The caller is
responsible for ensuring the tag and its wrapped item are written in order — the
same convention used by cbor_encode_array() and cbor_encode_map().
/* tag(1, 1234567890) — epoch-based date/time */
cbor_encode_tag(&writer, 1);
cbor_encode_unsigned_integer(&writer, 1234567890);
/* tag(0, "2006-01-13T20:11:12Z") — date/time string */
cbor_encode_tag(&writer, 0);
cbor_encode_text_string(&writer, "2006-01-13T20:11:12Z", 20);
/* Nested tags: tag(1, tag(2, 0)) */
cbor_encode_tag(&writer, 1);
cbor_encode_tag(&writer, 2);
cbor_encode_unsigned_integer(&writer, 0);The parser records each tag as a CBOR_ITEM_TAG item. Use
cbor_get_tag_number() to read the tag number; the item immediately following
it in the items[] array is the wrapped value.
/* tag(1, 1234567890) — epoch-based date/time (RFC 8949 §3.4.2)
* Encoded: 0xC1 0x1A 0x49 0x96 0x02 0xD2 */
uint8_t msg[] = { 0xC1, 0x1A, 0x49, 0x96, 0x02, 0xD2 };
cbor_reader_t reader;
cbor_item_t items[4];
size_t n;
cbor_reader_init(&reader, items, sizeof(items) / sizeof(*items));
cbor_parse(&reader, msg, sizeof(msg), &n); /* n == 2 */
/* items[0]: CBOR_ITEM_TAG, cbor_get_tag_number() == 1 */
/* items[1]: CBOR_ITEM_INTEGER, value == 1234567890 */
if (items[0].type == CBOR_ITEM_TAG) {
cbor_tag_t tag = cbor_get_tag_number(&items[0]);
uint32_t timestamp = 0;
cbor_decode(&reader, &items[1], ×tamp, sizeof(timestamp));
/* tag == 1, timestamp == 1234567890 */
}With cbor_iterate(), CBOR_ITEM_TAG items are delivered to the callback
before the wrapped item, so the tag number can be inspected in context:
static void on_item(const cbor_reader_t *reader,
const cbor_item_t *item, const cbor_item_t *parent, void *arg)
{
if (item->type == CBOR_ITEM_TAG) {
printf("tag %zu\n", cbor_get_tag_number(item));
return;
}
int64_t val = 0;
cbor_decode(reader, item, &val, sizeof(val));
printf("value %lld\n", (long long)val);
}cbor_unmarshal() treats tags as transparent: the parser callback registered
for a map key receives the wrapped value item, not the tag item, regardless of
how many tags are stacked.
- The maximum item length is
size_tbecause the interface return type issize_t. The argument's value in the specification can go up touint64_tthough - A negative integer ranges down to -2^63-1 other than -2^64 in the specification
- Sorting of encoded map keys is not supported
- On 32-bit targets where
size_tis 32 bits, tag numbers above2^32 - 1cannot be represented incbor_item_t.size;cbor_parse()returnsCBOR_INVALIDfor such values cbor_unmarshal()supports map (string or integer keys) and array (index) path segments; if the CBOR message root is neither a map nor an array, no path segment is pushed and only depth-0 parsers will matchcbor_unmarshal()may dispatch matching MAP and ARRAY container nodes in addition to scalar values and indefinite-length strings. If the root item is a MAP or ARRAY, a depth-0 parser can match and receive that root containercbor_dispatch()dispatches from the root whencontainer == NULL; when a non-NULL container is given, it must point to an item stored inreader->itemsfrom the same reader and only that container's children are dispatched