Generate English-like nonce words using configurable phonotactics.
npm install @unglish/word-generator

import { generateWord, generateWords } from "@unglish/word-generator";
const one = generateWord();
console.log(one.written.clean);
const deterministic = generateWord({ seed: 42 });
console.log(deterministic.written.clean);
const batch = generateWords(5, { seed: 42, mode: "lexicon" });
console.log(batch.map(w => w.written.clean));

generateWords(count, { seed }) is deterministic: the same seed returns the same
batch, and the words within a batch are successive draws from one seeded stream
rather than repeats of a single word.
By default, generation includes morphology when the active config enables it.
Pass { morphology: false } for bare root forms.
import { createSeededRng, generateWord } from "@unglish/word-generator";
const rand = createSeededRng(42);
const a = generateWord({ rand });
const b = generateWord({ rand });

Use seed for one-off deterministic calls, or pass rand to control a shared
RNG stream.
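The difference between per-call seeding and a shared stream can be illustrated with a small stand-alone sketch. It does not use the library: makeSeededRng is a hypothetical stand-in built on the public-domain mulberry32 generator, and pickLetter is a toy substitute for word generation. The library's createSeededRng may be implemented differently.

```typescript
// Hypothetical stand-in for a seeded RNG; not the library's createSeededRng.
type Rng = () => number;

// mulberry32: a small 32-bit PRNG that returns floats in [0, 1).
function makeSeededRng(seed: number): Rng {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Toy "generator": picks one letter using whatever RNG it is handed.
const LETTERS = "abcdefghijklmnopqrstuvwxyz";
function pickLetter(rand: Rng): string {
  return LETTERS.charAt(Math.floor(rand() * LETTERS.length));
}

// Per-call seeding: every call starts a fresh stream, so the result repeats.
const first = pickLetter(makeSeededRng(42));
const second = pickLetter(makeSeededRng(42));
console.log(first === second); // true

// Shared stream: each call consumes the next draw from the same generator.
const shared = makeSeededRng(42);
const drawA = shared();
const drawB = shared();
console.log(drawA, drawB);
```

Under these assumptions, a fresh generator per call behaves like passing seed, while reusing one generator across calls behaves like passing rand.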
For n-gram or orthography outliers, use trace: true and inspect word.trace
instead of only checking surface strings.
import { generateWord } from "@unglish/word-generator";
const word = generateWord({ seed: 42, mode: "lexicon", trace: true });
console.log(word.written.clean);
console.log(word.trace?.summary);
console.log(word.trace?.stages[0]);
console.log(word.trace?.graphemeSelections[0]);

Detailed trace workflow: docs/word-trace-diagnostics.md
Generation now plans words top-down:
- sample a target phoneme count,
- sample a compatible syllable count,
- distribute onset/coda consonant budgets across syllables,
- generate phonemes, repairs, pronunciation, and spelling.
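The first three planning steps can be sketched as follows. This is a minimal illustration, not the library's implementation: every name, the weight values, and the round-robin budget rule are assumptions made up for the example, and it assumes one vowel nucleus per syllable. The final step (phonemes, repairs, pronunciation, spelling) is left out.

```typescript
// Hypothetical weight table: pairs of [value, weight]. Values are invented.
type Weighted = ReadonlyArray<readonly [value: number, weight: number]>;

// Weighted choice; `roll` is a number in [0, 1), e.g. from a seeded RNG.
function sampleWeighted(table: Weighted, roll: number): number {
  if (table.length === 0) throw new Error("empty weight table");
  const total = table.reduce((sum, [, weight]) => sum + weight, 0);
  let remaining = roll * total;
  let last = table[0][0];
  for (const [value, weight] of table) {
    last = value;
    remaining -= weight;
    if (remaining < 0) return value;
  }
  return last;
}

// Step 1 input: weights for the target phoneme count.
const phonemeCountWeights: Weighted = [[3, 2], [4, 5], [5, 3]];

// Step 2 input: syllable counts compatible with each phoneme count.
const syllableCountWeights: Record<number, Weighted> = {
  3: [[1, 9], [2, 1]],
  4: [[1, 4], [2, 6]],
  5: [[2, 7], [3, 3]],
};

interface SyllablePlan { onset: number; coda: number; }

function planWord(rolls: [number, number]): SyllablePlan[] {
  const phonemes = sampleWeighted(phonemeCountWeights, rolls[0]);
  const syllables = sampleWeighted(syllableCountWeights[phonemes], rolls[1]);
  const plan: SyllablePlan[] = Array.from({ length: syllables }, () => ({ onset: 0, coda: 0 }));
  // Step 3: one nucleus per syllable; deal the remaining consonants
  // round-robin, filling onsets on the first pass and codas on the next.
  let consonants = phonemes - syllables;
  for (let i = 0; consonants > 0; i++, consonants--) {
    const slot = plan[i % syllables];
    if (Math.floor(i / syllables) % 2 === 0) slot.onset++;
    else slot.coda++;
  }
  return plan;
}

console.log(planWord([0.5, 0.5])); // [ { onset: 1, coda: 0 }, { onset: 1, coda: 0 } ]
```

Planning the counts before generating any phoneme means the consonant budget is fixed up front, so later stages fill slots rather than deciding word length as they go.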
The built-in English config ships with top-down planning wired through two tables:
- phonemeLengthWeights
- phonemeToSyllableWeights
Custom language configs should provide both tables. They are required parts of
LanguageConfig, not optional tuning extras.
After retuning those tables, run:
npm run analyze:phoneme-length
npm run test:quality

Boundary adjustment probabilities moved to a dedicated
generationWeights.boundaryPolicy object.
import { createGenerator, englishConfig } from "@unglish/word-generator";
const generator = createGenerator({
  ...englishConfig,
  generationWeights: {
    ...englishConfig.generationWeights,
    boundaryPolicy: {
      equalSonorityDrop: 90,
      risingCodaDrop: 25,
    },
  },
});

Breaking change:
- generationWeights.probability.boundaryDrop was removed.
- Use generationWeights.boundaryPolicy.equalSonorityDrop instead.
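For configs written against the old shape, a one-off migration might look like the sketch below. Only the two property paths come from the note above; the LegacyWeights type and the migrateBoundaryDrop helper are hypothetical, and the sketch assumes the old value carries over unchanged on the same scale.

```typescript
// Hypothetical, loosely typed view of generationWeights for migration only.
interface LegacyWeights {
  probability?: { boundaryDrop?: number; [key: string]: unknown };
  boundaryPolicy?: { equalSonorityDrop?: number; [key: string]: unknown };
  [key: string]: unknown;
}

// Moves probability.boundaryDrop to boundaryPolicy.equalSonorityDrop.
// Returns the input untouched when there is nothing to migrate.
function migrateBoundaryDrop(weights: LegacyWeights): LegacyWeights {
  const probability = weights.probability;
  if (probability === undefined) return weights;
  const legacy = probability.boundaryDrop;
  if (legacy === undefined) return weights;
  // Drop the removed key and keep every other probability entry.
  const { boundaryDrop: _dropped, ...rest } = probability;
  return {
    ...weights,
    probability: rest,
    boundaryPolicy: {
      ...weights.boundaryPolicy,
      // An explicit new-style value wins over the migrated one.
      equalSonorityDrop: weights.boundaryPolicy?.equalSonorityDrop ?? legacy,
    },
  };
}

const migrated = migrateBoundaryDrop({ probability: { boundaryDrop: 90 } });
console.log(migrated.boundaryPolicy?.equalSonorityDrop); // 90
```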
Stress and aspiration are declarative under pronunciation.
import { createGenerator, englishConfig } from "@unglish/word-generator";
const generator = createGenerator({
  ...englishConfig,
  pronunciation: {
    ...englishConfig.pronunciation,
    stress: {
      ...englishConfig.pronunciation.stress,
      primary: { type: "penultimate" },
    },
    aspiration: {
      enabled: true,
      targets: [{ segment: "onset", index: 0, manner: ["stop"], voiced: false }],
      rules: [{ id: "word-initial", when: { wordInitial: true }, probability: 100 }],
      fallbackProbability: 0,
    },
  },
});

npm test
npm run lint
npm run dev

Additional checks:
- npm run test:quality
- npm run analyze:phoneme-length
- npm run test:perf
- npm run analyze:phonemes
- npm run analyze:trigrams
- npm run audit:trace
- Contribution workflow: CONTRIBUTING.md
- Agent-specific constraints: agents.md
- Diagnostics/design docs index: docs/README.md
- Tuning notes and diagnostics: TUNING.md