Emoji Stats
Explore statistics about the Unicode emoji set: distribution by category, growth by version, and breakdown by type.
Emojis by Category
Emojis by Type
Emojis Added per Version
How to Use
1. Select a statistics category
Choose from breakdowns by Unicode group, emoji version, modifier support, ZWJ sequence type, or platform coverage. Each view offers a different analytical lens on the Unicode emoji dataset.
2. Explore charts and counts
Review the charts showing how emoji are distributed across groups, how many were added in each Unicode version, and what percentage support skin tone modifiers. Hover over chart segments for exact counts and percentage breakdowns.
3. Drill into specific subsets
Click any chart segment or table row to filter the full emoji list to that subset, so you can browse every emoji matching the selected criterion alongside its codepoints and Unicode version data.
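The per-group and per-version counts behind these views can be reproduced offline. The following is a minimal sketch, assuming input lines in the layout used by emoji-test.txt; the `tally` function and the abbreviated `SAMPLE` text are illustrative names and data, not part of this tool.

```python
import re
from collections import Counter

# Abbreviated sample in the emoji-test.txt layout (illustrative, not the full file).
SAMPLE = """\
# group: Smileys & Emotion
# subgroup: face-smiling
1F600          ; fully-qualified     # 😀 E1.0 grinning face
1F603          ; fully-qualified     # 😃 E0.6 grinning face with big eyes
# group: People & Body
# subgroup: hand-fingers-open
1F44B          ; fully-qualified     # 👋 E0.6 waving hand
1F44B 1F3FB    ; fully-qualified     # 👋🏻 E1.0 waving hand: light skin tone
# group: Flags
# subgroup: country-flag
1F1EF 1F1F5    ; fully-qualified     # 🇯🇵 E0.6 flag: Japan
"""

# codepoints ; status # emoji E<version> name
LINE = re.compile(r"^([0-9A-F ]+?)\s*;\s*([\w-]+)\s*#\s*\S+\s+E(\d+\.\d+)\s+(.*)$")

def tally(text):
    """Count fully-qualified entries per group and per emoji version."""
    by_group, by_version = Counter(), Counter()
    group = None
    for line in text.splitlines():
        if line.startswith("# group:"):
            group = line.split(":", 1)[1].strip()
            continue
        match = LINE.match(line)
        if match and match.group(2) == "fully-qualified":
            by_group[group] += 1
            by_version[match.group(3)] += 1
    return by_group, by_version

groups, versions = tally(SAMPLE)
total = sum(groups.values())
for name, count in groups.most_common():
    print(f"{name}: {count} ({count / total:.0%})")
```

Restricting the tally to fully-qualified entries avoids counting the same emoji again under its minimally-qualified or unqualified spellings.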
About
The Unicode emoji dataset can be analyzed along several dimensions. At the top level, emoji are grouped into nine categories in emoji-test.txt (plus a separate Component group that holds skin tone and hair components), but these groups are merely organizational conventions. The normative metadata lives in emoji-data.txt, which assigns each codepoint properties like Emoji, Emoji_Presentation, Emoji_Modifier, Emoji_Modifier_Base, and Emoji_Component. These properties govern how emoji behave in text processing, input methods, and rendering engines.
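To make the property assignments concrete, here is a small sketch of reading lines in the emoji-data.txt layout (`codepoint-or-range ; Property # comment`). The `parse_properties` helper and the `SAMPLE` lines are hypothetical, and the sample comments are simplified compared with the real file.

```python
from collections import defaultdict

# Abbreviated sample in the emoji-data.txt layout (comments simplified).
SAMPLE = """\
0023          ; Emoji                # E0.0   [1] hash sign
0030..0039    ; Emoji                # E0.0  [10] digit zero..digit nine
1F600         ; Emoji_Presentation   # E1.0   [1] grinning face
1F3FB..1F3FF  ; Emoji_Modifier       # E1.0   [5] light skin tone..dark skin tone
1F44B         ; Emoji_Modifier_Base  # E0.6   [1] waving hand
"""

def parse_properties(text):
    """Map each property name to the set of codepoints that carry it."""
    props = defaultdict(set)
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not data:
            continue  # blank or comment-only line
        cp_field, prop = (part.strip() for part in data.split(";"))
        start, _, end = cp_field.partition("..")  # single codepoint or range
        first = int(start, 16)
        last = int(end, 16) if end else first
        props[prop].update(range(first, last + 1))
    return props

props = parse_properties(SAMPLE)
print(len(props["Emoji"]))                 # codepoints tagged Emoji in the sample
print(0x1F3FD in props["Emoji_Modifier"])  # medium skin tone is a modifier
```

Storing codepoints as sets keeps membership tests cheap, which suits checks such as whether a character may take a skin tone modifier.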
From a growth perspective, the emoji set has expanded from the original 722 emoji in Unicode 6.0 (2010), largely imported from Japanese carrier encodings, to roughly 3,800 in Emoji 16.0 once sequences are counted. Each annual release cycle involves formal proposals submitted to the Unicode Emoji Subcommittee, review by the Unicode Technical Committee, and a beta period for public comment. Proposals are evaluated against criteria including expected usage frequency, distinctiveness from existing emoji, and breadth of appeal across cultures and geographies. Many proposed emoji are rejected on the grounds of excessive similarity to existing characters or insufficient evidence of widespread demand.
Sequence emoji (skin tone modifier sequences, ZWJ sequences, keycap sequences, and flag sequences) account for a significant portion of the emoji catalog but share relatively few base codepoints. This architectural efficiency means that adding a new base emoji character can enable many derived sequences: a single new modifier base, for example, yields five skin tone variants. The statistics reflect this design: the 'People & Body' group is the largest due to the combinatorial explosion of skin tone and gender variants, while 'Flags' forms a distinct cluster defined almost entirely by sequence mechanics rather than individual glyph design.
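The sequence mechanics can be illustrated with a short sketch. The helper names below (`flag`, `with_skin_tones`, `keycap`, `zwj_join`) are hypothetical; the codepoints are the standard regional indicators, skin tone modifiers U+1F3FB..U+1F3FF, the zero width joiner U+200D, variation selector-16 U+FE0F, and the combining enclosing keycap U+20E3. Whether a given sequence renders as a single glyph depends on the platform's font support.

```python
ZWJ = "\u200D"     # zero width joiner
VS16 = "\uFE0F"    # variation selector-16 (emoji presentation)
KEYCAP = "\u20E3"  # combining enclosing keycap
SKIN_TONES = [chr(cp) for cp in range(0x1F3FB, 0x1F400)]  # U+1F3FB..U+1F3FF

def flag(region):
    """Build a flag sequence from a two-letter region code, e.g. 'JP'."""
    # Regional indicator symbol letter A is U+1F1E6.
    return "".join(chr(0x1F1E6 + ord(c) - ord("A")) for c in region.upper())

def with_skin_tones(base):
    """Return the five modifier sequences for one modifier-base character."""
    return [base + tone for tone in SKIN_TONES]

def keycap(ch):
    """Build a keycap sequence from a digit, '#', or '*'."""
    return ch + VS16 + KEYCAP

def zwj_join(*parts):
    """Join emoji with zero width joiners to form a ZWJ sequence."""
    return ZWJ.join(parts)

print(flag("JP"))                              # two regional indicators
print(with_skin_tones("\U0001F44B"))           # waving hand in five tones
print(keycap("1"))                             # keycap: 1
print(zwj_join("\U0001F468", "\U0001F4BB"))    # man + laptop (man technologist)
```

Every result here is built from codepoints that already exist, which is why sequence-heavy groups grow much faster than the pool of base characters.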