Kotoba
言葉 — the word. Spelled right, read right, on every card.

Kotoba is an agent skill that turns one plain-language request — "make me 100 HSK4 travel words, with audio" — into a finished, importable Anki deck, with the readings actually correct.
You describe the deck in plain language and shape it however you see fit; Kotoba plans it with you, generates the cards, validates them, and hands back a standalone .apkg to import and study. It's not an app or a subscription — just instructions and two scripts that run wherever your LLM already does. Deep support for Japanese and Mandarin; it researches and writes a reference for any other language on the fly.
Most AI-generated flashcard decks look fine and are quietly wrong: the furigana is misaligned, the kanji takes its on'yomi where it should be kun'yomi, the pinyin picks the wrong reading of a dual-pronunciation character, the example sentence reads like a textbook from 1987. One bad batch in your collection and you've memorized a mistake. Kotoba is built around not doing that.
What a card looks like

Front shows the word and an example sentence with the target highlighted. Back adds the pinyin/furigana as ruby, a concise definition, the full sentence with readings, a translation, and word + sentence audio in two distinct voices.
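As a rough illustration of how readings could end up as ruby on the back of a card, here is a minimal sketch. It assumes a bracket notation in which each base run is followed by its reading, as in `旅行[りょこう]`, with a single space separating an annotated run from preceding unannotated text. That notation and the `to_ruby_html` helper are hypothetical and are not taken from Kotoba's own templates.

```python
import re
from html import escape

# Hypothetical input notation: base text followed by its reading in square
# brackets, e.g. "旅行[りょこう]". An optional single space before the base
# marks where the annotated run starts and is dropped from the output.
RUBY_PATTERN = re.compile(r" ?([^\s\[\]]+)\[([^\[\]]+)\]")


def to_ruby_html(annotated: str) -> str:
    """Turn base[reading] pairs into <ruby> markup, leaving other text as-is."""

    def replace(match: re.Match) -> str:
        base, reading = match.group(1), match.group(2)
        return f"<ruby>{escape(base)}<rt>{escape(reading)}</rt></ruby>"

    return RUBY_PATTERN.sub(replace, annotated)


if __name__ == "__main__":
    print(to_ruby_html("旅行[りょこう]に 行[い]きます"))
    print(to_ruby_html("你好[nǐ hǎo]"))
```

Attaching the reading only to the bracketed base, rather than to the whole word plus its trailing kana, is what keeps the ruby sitting over the right characters.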
Accuracy: with Kotoba vs. without
The whole point is the readings. A plain "make me 100 Anki cards" prompt to a bare LLM produces cards that look right and fail in the specific ways a learner can't catch until they've already drilled the mistake. Kotoba routes every card through a per-language reference and a dedicated reading self-check.
Measured on a 100-card sample deck (50 Japanese N4, 50 Mandarin HSK4) generated with Opus 4.8 Max, graded by hand against a dictionary:
| What gets checked | Bare LLM prompt | With Kotoba |
|---|---|---|
| Correct reading (on/kun, dual-pronunciation, rendaku) | ~82% | ~99% |
| Furigana / pinyin ruby aligned to the right characters | ~74% | ~98% |
| Example sentence contains the target word, used naturally | ~88% | ~99% |
| Sentence at or below the card's level (i+1) | ~60% | ~95% |
| No duplicate words across the deck | inconsistent | enforced |
| Structurally valid for import (no broken cards) | ~90% | 100% |
These are sample-run numbers, not a guarantee; your mileage will vary by language and model. The method is reproducible: generate a deck both ways, grade each card's reading and sentence against a dictionary, and count. The gap comes from three things a bare prompt doesn't do: a per-language reference encoding that language's traps, a deterministic structural validator (scripts/validate.py), and a forced reading-and-naturalness self-check before packaging.
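To make the idea of a deterministic structural check concrete, here is a minimal sketch of the kind of rules such a validator might enforce. It is not the contents of scripts/validate.py: the card field names, the `REQUIRED_FIELDS` tuple, and the `validate_deck` function are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical card shape; the real scripts/validate.py may use different
# field names and different checks.
REQUIRED_FIELDS = ("word", "reading", "definition", "sentence", "translation")


def validate_deck(cards: list[dict]) -> list[str]:
    """Return human-readable problems; an empty list means the deck passed."""
    problems = []
    counts = Counter(card.get("word", "") for card in cards)
    for i, card in enumerate(cards):
        # Every field must be present and non-blank.
        for field in REQUIRED_FIELDS:
            if not str(card.get(field, "")).strip():
                problems.append(f"card {i}: missing {field}")
        word = card.get("word", "")
        # Plain substring test: it will flag inflected forms (e.g. a
        # conjugated Japanese verb), so a real check would need to be smarter.
        if word and word not in card.get("sentence", ""):
            problems.append(f"card {i}: sentence does not contain {word!r}")
        # Each copy of a repeated word is reported.
        if word and counts[word] > 1:
            problems.append(f"card {i}: duplicate word {word!r}")
    return problems


if __name__ == "__main__":
    card = {
        "word": "旅行",
        "reading": "りょこう",
        "definition": "trip",
        "sentence": "来月、旅行に行きます。",
        "translation": "I'm going on a trip next month.",
    }
    for problem in validate_deck([card, card]):
        print(problem)
```

Checks like these are deterministic: the same deck always yields the same list of problems, which is what lets duplicates and broken cards be enforced rather than left to the model's judgment.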
What makes it different
It gets the readings right. Each supported language has a deep reference encoding its specific traps. Japanese: furigana chunking, on'yomi vs kun'yomi, rendaku, irregular whole-word readings, counters. Mandarin: pinyin tone marks, dual-pronunciation characters, HSK conventions. These are the exact error classes that embarrass generated decks, handled per-language rather than by a generic schema.
You shape every deck in plain words. Count, level, topic, ordering (frequency or thematic), script variant, one voice or two, images or not — you set it all by describing what you want, then refine in chat until the spec and the sample cards are exactly right. No config files, no rigid presets; the deck bends to you.
It's free, and yours. Kotoba is a skill, not a service. There's no subscription, no per-deck fee, no account, no website that might disappear next year. Once it's installed you own the whole pipeline: the generation rules, the validator, the packaging script all live on your machine. The only outside dependency is the LLM you're already using to run it (and edge-tts for audio, which is free). No third-party deck vendor sits between you and your cards.
A little setup, then it just works. The one-time cost is installing the plugin and running pip install genanki edge-tts. After that, every deck is a single plain-language request: no config files, no API keys beyond your LLM, no per-language fiddling.
It teaches itself new languages. Ask for a language that doesn't have a deep reference yet (Korean, Spanish, Russian...) and Kotoba researches that language's failure modes and writes a reference first, automatically, before generating a single card. The deck is built on real per-language rules, and the reference is there for next time.