Offline Kanji Data.
Zero Dependencies.

A distilled, lightning-fast database of 13,000+ kanji characters and vocabulary words. Optimized with lazy-loading shards for memory-constrained serverless environments.

~ $ npm install kanji-data

The Problem

Typically, accessing a comprehensive Japanese dictionary offline means parsing a massive 100MB+ JSON file.

Parsing it blocks the Node.js event loop, which slows app startup.
The parsed data consumes 300MB+ of RAM, enough to exceed the memory limits of serverless environments (Vercel, AWS Lambda).
Relying on SQLite instead pulls in native C++ bindings built with `node-gyp`, a common source of install errors.

The Solution

kanji-data solves the memory problem using build-time data sharding and lazy evaluation.

Instead of shipping one massive file, the database is pre-compiled into tiny, optimized chunks.
Core metadata is kept small and loaded in full at initialization.
Massive vocabulary lists are split by Unicode hex-prefixes and only loaded into memory (in ~1MB chunks) exactly when requested.
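
The sharding and lazy-loading idea above can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: the shard key scheme (the code point with its low byte dropped, in hex), the names `shardKeyFor` and `createShardLoader`, and the injected `readShard` function are all assumptions made for the example.

```javascript
// Hypothetical sketch: derive a shard key from a character's Unicode code point.
// Assumption: one shard per 256 code points, keyed by the hex prefix
// (e.g. 猫 is U+732B, so its key is "73").
function shardKeyFor(character) {
  const codePoint = character.codePointAt(0);
  return (codePoint >> 8).toString(16);
}

// Wraps a shard reader with an in-memory cache so each shard is read at most once.
// `readShard(key)` is supplied by the caller (for example, a function that reads
// and parses one JSON file); injecting it keeps the sketch independent of any
// particular file layout.
function createShardLoader(readShard) {
  const cache = new Map();
  return function wordsFor(character) {
    const key = shardKeyFor(character);
    if (!cache.has(key)) {
      cache.set(key, readShard(key)); // first access: load and cache the shard
    }
    return cache.get(key)[character] || [];
  };
}

// Usage with an in-memory stand-in for the file system.
let reads = 0;
const fakeShards = { '73': { '猫': [{ written: '猫', pronounced: 'ねこ' }] } };
const wordsFor = createShardLoader((key) => {
  reads += 1;
  return fakeShards[key] || {};
});

wordsFor('猫'); // triggers one shard read
wordsFor('猫'); // served from the cache
console.log(reads); // 1
```

Only the shards that are actually requested ever occupy memory, which is the property the design relies on.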

Zero Dependencies

Pure JavaScript and JSON. No databases, no binaries.

Serverless Ready

Cold starts are nearly instantaneous with a tiny footprint.

100% Offline

No API keys, no rate limits, no network latency.

Smart Caching

Chunks are cached in memory after the first read.

Dead Simple API

Fully typed. Intuitively structured. Ready in milliseconds.

usage.js

const kanji = require('kanji-data');

// 1. Get core metadata instantly
const neko = kanji.get('猫');
console.log(neko.meanings);    // ['cat']
console.log(neko.kun_readings); // ['ねこ']
console.log(neko.jlpt);        // 3

// 2. Fetch vocabulary (Lazily loads a ~1MB shard)
const nekoWords = kanji.getWords('猫');
console.log(nekoWords[0]);
/*
{
  variants: [ { written: "猫", pronounced: "ねこ", priorities: ["spec1"] } ],
  meanings: [ { glosses: ["cat"] } ]
}
*/

// 3. Get entire JLPT lists
const n5 = kanji.getJlpt(5);
console.log(n5); // ['一', '二', '三', '日', '月', ...]

// 4. Get kanji by school grade
const grade1 = kanji.getGrade(1);
console.log(grade1); // ['一', '右', '雨', '円', ...]

// 5. Get ALL 13,000+ kanji
const all = kanji.getAll();
console.log(all.length); // 13108

// 6. Extract kanji from text
kanji.extractKanji('猫が好き'); // ['猫', '好']

// 7. Search by meaning / reading
kanji.search('fire');   // [{ kanji: '火', ... }, ...]
kanji.search('ねこ');   // [{ kanji: '猫', ... }]

// 8. Get a random N5 kanji
kanji.getRandom({ jlpt: 5 }); // e.g. { kanji: '日', ... } (result varies per call)

Official API Reference

get(character: string): Object | null

Returns the core metadata for a given kanji character. The lookup is fast because core metadata is kept small and loaded in full at initialization. Returns null if the character is not found.

getWords(character: string): Array

Returns an array of vocabulary words that use the specified kanji.

Lazy Loading: Calling this method for the first time on a specific character will trigger a file system read of the ~1MB dictionary shard containing that character's vocabulary. The shard is then cached in memory. If the kanji has no associated words or doesn't exist, returns an empty array [].
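
As a rough illustration of working with the returned entries, the helpers below format and filter words of the shape shown in the usage example (`variants` with `written`, `pronounced`, `priorities`; `meanings` with `glosses`). The helper names are hypothetical, the sample array is hand-written rather than real package output, and treating a non-empty `priorities` list as a sign of a common word is an assumption.

```javascript
// Formats one word entry as "written (reading): gloss; gloss".
function formatWord(word) {
  const first = word.variants[0];
  const glosses = word.meanings.flatMap((m) => m.glosses);
  return `${first.written} (${first.pronounced}): ${glosses.join('; ')}`;
}

// Keeps only words where at least one variant carries a priority tag.
// Assumption: a non-empty `priorities` array marks a commonly used word.
function commonWordsOnly(words) {
  return words.filter((w) =>
    w.variants.some((v) => Array.isArray(v.priorities) && v.priorities.length > 0)
  );
}

// Hand-written sample in the documented shape (not real package output).
const sampleWords = [
  {
    variants: [{ written: '猫', pronounced: 'ねこ', priorities: ['spec1'] }],
    meanings: [{ glosses: ['cat'] }],
  },
  {
    variants: [{ written: '猫舌', pronounced: 'ねこじた', priorities: [] }],
    meanings: [{ glosses: ['dislike of hot food'] }],
  },
];

console.log(commonWordsOnly(sampleWords).map(formatWord)); // [ '猫 (ねこ): cat' ]
```

With the real library, the same helpers would be applied to the result of kanji.getWords('猫').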

getJlpt(level: number): Array<string>

Returns an array of kanji characters that belong to the specified JLPT level (valid inputs: 1 through 5). Returns [] for invalid levels.

getGrade(grade: number): Array<string>

Returns an array of kanji taught in the specified Japanese school grade. Grades 1–6 are elementary school (教育漢字). Grade 8 covers Jōyō kanji not assigned to grades 1–6. Grade 9 covers Jinmeiyō kanji used in names. Returns [] for grades with no data.

getAll(): Array<string>

Returns an array of all 13,000+ kanji characters in the database. This is useful for building custom filters or iterating over the full dataset.

extractKanji(text: string): Array<string>

Extracts unique kanji characters from a string of Japanese text. Only returns characters that exist in the database. Useful for analyzing user input or parsing Japanese content. Returns [] if no kanji are found.
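
The behavior described above could be approximated like this. It is a sketch under stated assumptions, not the library's code: `extractKnownKanji` is a hypothetical name, `knownKanji` stands in for the database's character set, and the use of the Unicode `Script=Han` property to detect kanji is a choice made for the example.

```javascript
// Hypothetical sketch of kanji extraction.
// `known` is a Set standing in for the characters present in the database.
function extractKnownKanji(text, known) {
  const seen = new Set();
  // for...of walks a string by code point, so kanji outside the
  // Basic Multilingual Plane are not split into surrogate halves.
  for (const ch of text) {
    if (/\p{Script=Han}/u.test(ch) && known.has(ch)) {
      seen.add(ch); // a Set drops duplicates and keeps first-seen order
    }
  }
  return [...seen];
}

const knownKanji = new Set(['猫', '好', '犬']);
console.log(extractKnownKanji('猫が好き、猫も好き', knownKanji)); // [ '猫', '好' ]
```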

search(query: string): Array<KanjiMetadata>

Searches for kanji by English meaning, kun reading, or on reading. Matching is case-insensitive and partial, so searching "cat" will match kanji with meanings like "cat", "scatter", and "educate". Returns an array of KanjiMetadata objects.
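
Because partial matching can return more than you want, it may help to narrow the results afterwards. The sketch below contrasts substring matching with an exact-meaning filter. Both function names are hypothetical, the sample entries are hand-written stand-ins, and only the `kanji` and `meanings` fields shown elsewhere on this page are assumed.

```javascript
// Case-insensitive substring match over meanings, mirroring the behavior described above.
function partialMatch(entries, query) {
  const q = query.toLowerCase();
  return entries.filter((e) => e.meanings.some((m) => m.toLowerCase().includes(q)));
}

// Stricter filter: keep entries where some meaning equals the query exactly
// (still ignoring case).
function exactMeaningMatch(entries, query) {
  const q = query.toLowerCase();
  return entries.filter((e) => e.meanings.some((m) => m.toLowerCase() === q));
}

// Hand-written stand-in entries, not real package output.
const sampleEntries = [
  { kanji: '猫', meanings: ['cat'] },
  { kanji: '散', meanings: ['scatter', 'disperse'] },
];

console.log(partialMatch(sampleEntries, 'CAT').map((e) => e.kanji));      // [ '猫', '散' ]
console.log(exactMeaningMatch(sampleEntries, 'CAT').map((e) => e.kanji)); // [ '猫' ]
```

With the real library, the second helper could be applied to the output of kanji.search('cat') to drop matches such as "scatter".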

getByStrokeCount(count: number): Array<KanjiMetadata>

Returns all kanji with the specified stroke count. Input must be a positive integer. Returns [] for invalid input (zero, negative, non-integer) or stroke counts with no kanji.
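
The input rules above (positive integers only, [] otherwise) can be illustrated with a small sketch. The names `isValidStrokeCount` and `byStrokeCount` are hypothetical, and the `stroke_count` field name is an assumption, since this page does not show which property holds the stroke count.

```javascript
// True only for positive integers; rejects 0, negatives, fractions, NaN, and non-numbers.
function isValidStrokeCount(count) {
  return Number.isInteger(count) && count > 0;
}

// Returns matching entries, or [] for invalid input.
// Assumption: each entry stores its stroke count under `stroke_count`.
function byStrokeCount(entries, count) {
  if (!isValidStrokeCount(count)) return [];
  return entries.filter((e) => e.stroke_count === count);
}

// Hand-written stand-in entries.
const strokeSample = [
  { kanji: '火', stroke_count: 4 },
  { kanji: '日', stroke_count: 4 },
  { kanji: '猫', stroke_count: 11 },
];

console.log(byStrokeCount(strokeSample, 4).map((e) => e.kanji)); // [ '火', '日' ]
console.log(byStrokeCount(strokeSample, 2.5));                   // []
```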

getRandom(options?): KanjiMetadata | null

Returns a random kanji, optionally filtered by jlpt and/or grade. Perfect for building quiz apps or flashcard games. Returns null when no kanji match the filter criteria.
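
One way filtered random selection like this might work is sketched below. It is not the library's implementation: `pickRandom` and `randomKanji` are hypothetical names, the `grade` field on each entry is an assumption, and the injectable `rng` parameter exists only so the example can be made deterministic.

```javascript
// Picks one element at random, or null for an empty pool.
// `rng` defaults to Math.random and can be swapped out for repeatable tests.
function pickRandom(pool, rng = Math.random) {
  if (pool.length === 0) return null;
  return pool[Math.floor(rng() * pool.length)];
}

// Filters by `jlpt` and/or `grade` when given, then picks at random.
// Assumption: entries expose `jlpt` and `grade` as numbers.
function randomKanji(entries, options = {}, rng = Math.random) {
  const pool = entries.filter(
    (e) =>
      (options.jlpt === undefined || e.jlpt === options.jlpt) &&
      (options.grade === undefined || e.grade === options.grade)
  );
  return pickRandom(pool, rng);
}

// Hand-written stand-in entries.
const quizSample = [
  { kanji: '猫', jlpt: 3, grade: 8 },
  { kanji: '日', jlpt: 5, grade: 1 },
  { kanji: '月', jlpt: 5, grade: 1 },
];

console.log(randomKanji(quizSample, { jlpt: 5 }));           // one of the two N5 entries
console.log(randomKanji(quizSample, { jlpt: 5, grade: 8 })); // null (no entry matches both)
```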

searchWords(query: string): Array<Word>

Searches for vocabulary words by English meaning or reading across all word shards.

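Performance: The first call loads all ~100 word shards into memory, so expect it to be noticeably slower and to raise memory use (on the order of 100MB at ~1MB per shard). Subsequent calls reuse the cached shards and skip the disk reads. Returns [] if nothing matches.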
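
To see the cold versus warm difference in your own environment, a small timing wrapper is enough. The `timed` helper below is a hypothetical utility, not part of the package; it uses Node's process.hrtime.bigint() and is demonstrated here on a stand-in function so the block runs without the library installed.

```javascript
// Runs `fn` once and reports its result together with the elapsed time in milliseconds.
function timed(fn) {
  const start = process.hrtime.bigint();
  const result = fn();
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  return { result, ms };
}

// Stand-in workload; with the library installed this could instead be
// () => kanji.searchWords('cat'), called twice to compare the first and second run.
const measurement = timed(() => [1, 2, 3].map((n) => n * 2));
console.log(measurement.result);             // [ 2, 4, 6 ]
console.log(typeof measurement.ms, measurement.ms >= 0); // number true
```

If the first call's cost matters for your cold starts, one option is to trigger it during initialization rather than on the first user request.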
