Collection Builder

Build and export custom emoji collections. Create themed sets for projects, presentations, or social media.

Generator
๐Ÿ“ฆ

Your collection is empty. Search above to add emojis.

Saved Collections

How to Use

  1. 1
    Search and add emoji to your set

    Use the search bar or category browser to find emoji, then click the plus button to add each one to your collection. Collections can contain up to any number of emoji and can be reordered by drag-and-drop.

  2. 2
    Name and organize your collection

    Give your collection a descriptive name and optionally add labels or group emoji into subcategories. Collections are stored in your browser's local storage or can be exported as a JSON file for use in other tools or code.

  3. 3
    Export in your preferred format

    Export the collection as a JSON array of codepoints, a newline-separated text file, a JavaScript array, a Python list, or a copy-all string. Each export format includes the emoji characters and their Unicode codepoints for downstream processing.

About

Building and managing custom emoji collections bridges two domains: the Unicode character model, which defines the canonical identity of each emoji, and application data design, which determines how collections are stored, transmitted, and rendered. At the Unicode level, each emoji in a collection is best represented by its fully-qualified codepoint sequence as defined in emoji-test.txt โ€” this ensures portability across tools, languages, and platforms. Storing emoji as raw Unicode strings (rather than shortcodes or platform-specific IDs) makes collections language-agnostic and future-proof against shortcode dictionary changes.

For web applications, rendering a custom emoji collection requires choosing between three approaches: relying on OS system fonts (zero dependencies, but platform-specific glyphs), using a bundled emoji font like Noto Color Emoji (consistent rendering, but adds ~10MB to bundle size), or serving individual emoji as SVG/PNG assets from a library like Twemoji (granular control, best for performance at scale using lazy loading). Each approach has different implications for visual consistency, load performance, and maintenance overhead as new emoji versions are released.

Emoji collection use cases range from content moderation allow/block lists to custom keyboard subsets for specialized applications (children's apps, accessibility tools, domain-specific communication platforms). The Unicode emoji-data.txt file provides metadata useful for filtering collections by property โ€” for example, building a collection of only skin-tone-modifiable emoji (those with the Emoji_Modifier_Base property) or only fully-qualified sequences. The CLDR annotation dataset adds searchability across languages, enabling multilingual collection interfaces without maintaining separate keyword databases per locale.

FAQ

What is the best way to store emoji in a database?
The most important requirement for storing emoji in a database is using UTF-8 encoding with full 4-byte support. In MySQL, this means using the 'utf8mb4' character set โ€” MySQL's 'utf8' encoding is actually a 3-byte variant that cannot store any codepoint above U+FFFF, which excludes all Supplementary Multilingual Plane emoji. PostgreSQL's 'UTF8' encoding correctly handles all Unicode codepoints. When defining columns, use a collation that is emoji-aware (e.g., utf8mb4_unicode_ci in MySQL) to ensure correct sorting and comparison behavior. String length functions should also be verified โ€” MySQL's LENGTH() counts bytes while CHAR_LENGTH() counts characters.
How do I compare or deduplicate emoji in code?
Emoji comparison is straightforward when working with normalized Unicode strings: two emoji are equal if their codepoint sequences are identical. However, some emoji can be represented in multiple equivalent forms โ€” for example, variation selectors (U+FE0E, U+FE0F) may or may not be present on copies from different sources. Unicode Normalization Form C (NFC, canonical decomposition followed by canonical composition) is the standard approach to canonicalizing strings before comparison. For emoji specifically, ensuring all sequences are in fully-qualified form (as defined by emoji-test.txt) before comparison prevents false negatives when the same emoji is copied from different platforms.
How can I use a custom emoji set in a web application?
The most portable approach is to store emoji as Unicode codepoint strings and render them using either the OS system fonts or a bundled emoji font like Noto Color Emoji or Twemoji. For applications that need consistent cross-platform rendering, Twemoji (available under CC BY 4.0 for artwork) provides SVG/PNG assets for each emoji that can be served directly. Libraries like `emoji-mart` (React) and `EmojiPicker` (vanilla JS) support custom emoji sets defined as JSON manifests alongside standard Unicode emoji. For backend processing, maintaining a curated list as an array of Unicode codepoint strings ensures portability across programming languages and storage systems.
What is the maximum number of emoji in a Unicode collection?
There is no Unicode-defined maximum for a custom emoji collection โ€” you can include any subset or the full set of 3,953 emoji in Unicode Emoji 16.0. Practically, the performance limit in web applications depends on how the collection is rendered. Rendering all 3,953 emoji simultaneously as DOM elements is feasible on modern hardware, but virtual scrolling is recommended for smooth performance at that scale. When exporting or transmitting emoji collections as text strings, total byte size depends on encoding: a collection of 100 emoji encoded as UTF-8 codepoints averages approximately 400โ€“500 bytes, while collections using full JSON metadata per emoji will be orders of magnitude larger.
How are custom emoji (non-Unicode) handled in messaging platforms?
Major messaging platforms like Slack and Discord support custom emoji uploaded by workspace/server administrators as image files (typically PNG or GIF). These custom emoji are identified by shortcodes unique to that workspace and are not Unicode characters โ€” they are served as images referenced by shortcode in the platform's message format. When exported or shared outside the platform, custom emoji appear as their shortcode text (e.g., ':custom_emoji_name:') since there is no Unicode codepoint backing them. Some platforms like Discord also support animated emoji (using GIF or APNG) and premium emoji restricted to subscribers, all of which are platform-proprietary extensions beyond the Unicode standard.