Offline Kanji Data.
Zero Dependencies.
A distilled offline database of 13,000+ kanji characters and the vocabulary words that use them. The data is split into lazy-loaded shards, so it fits memory-constrained serverless environments.
npm install kanji-data
The Problem
Typically, accessing a comprehensive Japanese dictionary offline means parsing a massive 100MB+ JSON file. Both that approach and the usual alternative have costs:
- Parsing the file blocks the Node.js event loop, resulting in slow app startup.
- The parsed data consumes 300MB+ of RAM, which can crash memory-constrained serverless environments (Vercel, AWS Lambda).
- Switching to SQLite instead introduces native C++ dependencies (built with `node-gyp`) that can cause install errors.
The Solution
kanji-data solves the memory problem using build-time data sharding and lazy evaluation.
- Instead of shipping one massive file, the database is pre-compiled into tiny, optimized chunks.
- Core metadata is loaded instantly.
- The much larger vocabulary lists are split into shards by the Unicode hex prefix of each kanji, and a shard (~1MB) is loaded into memory only when a kanji in it is requested.
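The sharding idea above can be sketched roughly as follows. This is not the package's actual implementation: the shard-key rule (the code point in hex minus its last two digits), the `loadShard` callback, and the `Map` cache are all assumptions made for illustration.

```javascript
// Hypothetical sketch of prefix sharding with lazy loading.
// None of these names come from kanji-data itself.

// Assumed rule: shard key = code point in hex, minus its last two digits.
// '猫' is U+732B, so its key is '73'.
function shardKey(character) {
  const hex = character.codePointAt(0).toString(16);
  return hex.slice(0, -2);
}

// loadShard(key) is supplied by the caller and returns an object that maps
// characters to word lists, or undefined when no shard exists for the key.
function createWordStore(loadShard) {
  const cache = new Map(); // key -> shard, filled on first access

  return function getWords(character) {
    const key = shardKey(character);
    if (!cache.has(key)) {
      // The only place a shard is read; missing shards are cached as empty.
      cache.set(key, loadShard(key) || {});
    }
    return cache.get(key)[character] || [];
  };
}
```

With a loader that reads one JSON file per key from disk, nothing would be read until the first lookup, and each shard would be read at most once.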
Zero Dependencies
Pure JavaScript and JSON. No databases, no binaries.
Serverless Ready
Cold starts are nearly instantaneous with a tiny footprint.
100% Offline
No API keys, no rate limits, no network latency.
Smart Caching
Chunks are cached in memory after the first read.
Dead Simple API
Fully typed. Intuitively structured. Ready in milliseconds.
const kanji = require('kanji-data');
// 1. Get core metadata instantly
const neko = kanji.get('猫');
console.log(neko.meanings); // ['cat']
console.log(neko.kun_readings); // ['ねこ']
console.log(neko.jlpt); // 3
// 2. Fetch vocabulary (Lazily loads a ~1MB shard)
const nekoWords = kanji.getWords('猫');
console.log(nekoWords[0]);
/*
{
variants: [ { written: "猫", pronounced: "ねこ", priorities: ["spec1"] } ],
meanings: [ { glosses: ["cat"] } ]
}
*/
// 3. Get entire JLPT lists
const n5 = kanji.getJlpt(5);
console.log(n5); // ['一', '二', '三', '日', '月', ...]
// 4. Get kanji by school grade
const grade1 = kanji.getGrade(1);
console.log(grade1); // ['一', '右', '雨', '円', ...]
// 5. Get ALL 13,000+ kanji
const all = kanji.getAll();
console.log(all.length); // 13108
// 6. Extract kanji from text
kanji.extractKanji('猫が好き'); // ['猫', '好']
// 7. Search by meaning / reading
kanji.search('fire'); // [{ kanji: '火', ... }, ...]
kanji.search('ねこ'); // [{ kanji: '猫', ... }]
// 8. Get a random N5 kanji
kanji.getRandom({ jlpt: 5 }); // { kanji: '日', ... }
Official API Reference
get(character: string): Object | null
Returns the core metadata for a given kanji character. This operation is extremely fast as core metadata is kept small and loaded entirely on init. Returns `null` if the character is not found.
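Because `get` can return `null`, callers usually want a guard before reading fields. A minimal sketch, where `describeKanji` is a hypothetical helper (not part of the package) and `kanji` is assumed to be the module returned by `require('kanji-data')`:

```javascript
// Hypothetical helper: turns a lookup into a one-line description.
function describeKanji(kanji, character) {
  const entry = kanji.get(character);
  if (entry === null) {
    return `${character}: not in the database`;
  }
  // `meanings` is the array shown in the usage example above.
  return `${character}: ${entry.meanings.join(', ')}`;
}
```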
getWords(character: string): Array
Returns an array of vocabulary words that use the specified kanji.
Lazy Loading: Calling this method for the first time on a specific character will trigger a file system read of the ~1MB dictionary shard containing that character's vocabulary. The shard is then cached in memory. If the kanji has no associated words or doesn't exist, returns an empty array `[]`.
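The word objects can be flattened into display strings. The sketch below assumes every word has the shape shown in the usage example (`variants` with `written` and `pronounced`, `meanings` with `glosses`) and at least one variant. `listWords` is a hypothetical helper, not part of the package.

```javascript
// Hypothetical helper: formats the first few words for a kanji.
function listWords(kanji, character, limit = 5) {
  return kanji.getWords(character).slice(0, limit).map((word) => {
    const first = word.variants[0]; // assumed to exist for every word
    const glosses = word.meanings.flatMap((meaning) => meaning.glosses);
    return `${first.written} (${first.pronounced}): ${glosses.join('; ')}`;
  });
}
```

An unknown character simply yields an empty list, since `getWords` returns `[]` in that case.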
getJlpt(level: number): Array<string>
Returns an array of kanji characters that belong to the specified JLPT level (valid inputs: 1 through 5). Returns `[]` for invalid levels.
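When many characters need a level lookup, the five lists can be folded into a single reverse index. A sketch, assuming each kanji appears in at most one level (if not, the higher level number wins here). `buildJlptIndex` is a hypothetical helper.

```javascript
// Hypothetical helper: maps each kanji to its JLPT level.
function buildJlptIndex(kanji) {
  const levelOf = new Map();
  for (let level = 1; level <= 5; level++) {
    for (const character of kanji.getJlpt(level)) {
      levelOf.set(character, level);
    }
  }
  return levelOf;
}
```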
getGrade(grade: number): Array<string>
Returns an array of kanji taught in the specified Japanese school grade. Grades 1–6 are elementary school (教育漢字). Grade 8 covers Jōyō kanji not assigned to grades 1–6. Grade 9 covers Jinmeiyō kanji used in names. Returns `[]` for grades with no data.
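The grade numbering skips 7, which is easy to trip over when looping. The sketch below counts kanji per documented grade. The label strings and both helper names are made up for illustration.

```javascript
// Grades documented above; note that 7 is not used.
const GRADES = [1, 2, 3, 4, 5, 6, 8, 9];

// Hypothetical labels for display purposes.
function gradeLabel(grade) {
  if (grade >= 1 && grade <= 6) return `elementary grade ${grade}`;
  if (grade === 8) return 'Jōyō outside grades 1-6';
  if (grade === 9) return 'Jinmeiyō';
  return 'no data';
}

// Hypothetical helper: number of kanji in each documented grade.
function countByGrade(kanji) {
  const counts = {};
  for (const grade of GRADES) {
    counts[gradeLabel(grade)] = kanji.getGrade(grade).length;
  }
  return counts;
}
```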
getAll(): Array<string>
Returns an array of all ~13,000 kanji characters in the database. This is useful for building custom filters or iterating over the full dataset.
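A custom filter can be built by combining `getAll` with `get`. In this sketch, `filterKanji` and `hasSingleMeaning` are hypothetical names, and the only metadata field relied on is `meanings`, which appears in the usage example.

```javascript
// Hypothetical helper: keeps the kanji whose metadata satisfies a predicate.
function filterKanji(kanji, predicate) {
  const matches = [];
  for (const character of kanji.getAll()) {
    const entry = kanji.get(character);
    if (entry !== null && predicate(entry)) {
      matches.push(character);
    }
  }
  return matches;
}

// Example predicate: kanji with exactly one listed meaning.
const hasSingleMeaning = (entry) => entry.meanings.length === 1;
```

Calling `filterKanji(kanji, hasSingleMeaning)` walks the full dataset once, so it is better run once and stored than repeated per request.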
extractKanji(text: string): Array<string>
Extracts unique kanji characters from a string of Japanese text. Only returns characters that exist in the database. Useful for analyzing user input or parsing Japanese content. Returns `[]` if no kanji are found.
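One way to analyze user input is to find the kanji in a text that fall outside the JLPT levels a learner has studied. A sketch, where `unknownKanji` is a hypothetical helper built only on `extractKanji` and `getJlpt`:

```javascript
// Hypothetical helper: kanji in `text` that are not in any of `knownLevels`.
function unknownKanji(kanji, text, knownLevels) {
  const known = new Set(knownLevels.flatMap((level) => kanji.getJlpt(level)));
  return kanji.extractKanji(text).filter((character) => !known.has(character));
}
```

For a learner who has covered N5 and N4, the call would be `unknownKanji(kanji, text, [5, 4])`.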
search(query: string): Array<KanjiMetadata>
Searches for kanji by English meaning, kun reading, or on reading. Performs case-insensitive partial matching: searching "cat" will match kanji with meanings like "cat", "scatter", "educate". Returns an array of KanjiMetadata objects.
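Since matching is partial, results often need narrowing. The sketch below keeps only entries that have a meaning equal to the query, which drops "scatter" and "educate" for a "cat" search. `searchExactMeaning` is a hypothetical helper and assumes each result carries the `meanings` array shown in the usage example.

```javascript
// Hypothetical helper: narrows a partial-match search to exact meanings.
function searchExactMeaning(kanji, term) {
  const wanted = term.toLowerCase();
  return kanji.search(term).filter((entry) =>
    entry.meanings.some((meaning) => meaning.toLowerCase() === wanted)
  );
}
```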
getByStrokeCount(count: number): Array<KanjiMetadata>
Returns all kanji with the specified stroke count. Input must be a positive integer. Returns `[]` for invalid input (zero, negative, non-integer) or stroke counts with no kanji.
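Because the method takes a single count, a range query needs a loop. A sketch, where `byStrokeRange` is a hypothetical helper that relies on the documented behavior of returning `[]` for counts with no kanji:

```javascript
// Hypothetical helper: all kanji whose stroke count is between min and max, inclusive.
function byStrokeRange(kanji, min, max) {
  let result = [];
  for (let count = min; count <= max; count++) {
    result = result.concat(kanji.getByStrokeCount(count));
  }
  return result;
}
```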
getRandom(options?): KanjiMetadata | null
Returns a random kanji, optionally filtered by `jlpt` and/or `grade`. Perfect for building quiz apps or flashcard games. Returns `null` when no kanji match the filter criteria.
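A quiz question can be assembled from one random entry. This sketch assumes the returned object has the `kanji` and `meanings` fields seen in the usage example. `makeQuestion` and the shape of its return value are made up for illustration.

```javascript
// Hypothetical helper: builds a meaning question, or null if nothing matches.
function makeQuestion(kanji, options) {
  const entry = kanji.getRandom(options);
  if (entry === null) {
    return null; // the filter matched no kanji
  }
  return {
    prompt: `What does ${entry.kanji} mean?`,
    answers: entry.meanings,
  };
}
```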
searchWords(query: string): Array<Word>
Searches for vocabulary words by English meaning or reading across all word shards.
Performance: The first call loads all ~100 word shards into memory. Subsequent calls skip the disk reads because the shards are already cached. Returns `[]` if nothing matches.
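If the one-time load is a concern, it can be moved to startup instead of the first user request. A sketch, where `warmWordSearch` is a hypothetical helper and the query string is arbitrary:

```javascript
// Hypothetical helper: forces the shard load up front and reports how long it took.
function warmWordSearch(kanji) {
  const started = Date.now();
  kanji.searchWords('cat'); // per the note above, the first call loads every shard
  return Date.now() - started; // elapsed milliseconds
}
```

Warming keeps every shard in memory, so it trades away the small footprint that lazy loading provides. It is probably only worth doing where memory is not the constraint.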