feat(dictionary): AI-powered term extraction from free-form text #344

Open
hanselstner wants to merge 2 commits into OpenWhispr:main from hanselstner:feat/dictionary-ai-extract


@hanselstner hanselstner commented Feb 28, 2026

Summary

Extends the dictionary import dialog (from #343) with a second tab that uses the configured AI model to extract domain-specific terms from free-form text (meeting notes, documentation, emails). This makes it easy to build a custom dictionary from existing content.

Depends on: #343 (feat/dictionary-import)

Changes

  • src/components/DictionaryView.tsx: Add a tabbed import dialog ("List" / "Extract from text"); run AI extraction via ReasoningService.processText() with a terminology-extraction system prompt; show the extracted terms as removable chips before the import is confirmed
  • All 10 locale files: Add tabList, tabExtract, extractDescription, extractPlaceholder, extractButton, extracting, extractCount, extractEmpty, extractError translation keys
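
The extraction flow described above could be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the prompt wording, the `ModelCall` type, and the `parseModelOutput` and `extractTerms` helpers are all hypothetical, and the model call is injected as a plain function because the signature of ReasoningService.processText() is not shown here. It also assumes the model is asked to return one term per line.

```typescript
// Illustrative prompt only; the PR's actual system prompt is not shown.
const EXTRACTION_PROMPT =
  "Extract domain-specific terms (jargon, names, abbreviations) from the text. " +
  "Return one term per line with no commentary.";

// Hypothetical stand-in for the configured AI model call.
type ModelCall = (systemPrompt: string, text: string) => Promise<string>;

// Turn raw model output into a clean, de-duplicated list of terms.
function parseModelOutput(raw: string): string[] {
  const seen = new Set<string>();
  const terms: string[] = [];
  for (const line of raw.split(/\r?\n/)) {
    // Strip list markers such as "- ", "* ", "• " or "1. " that a model may add.
    const term = line.replace(/^\s*(?:[-*•]|\d+[.)])\s+/, "").trim();
    if (term === "") continue;
    // Compare case-insensitively so "Kubernetes" and "kubernetes" collapse.
    const key = term.toLowerCase();
    if (seen.has(key)) continue;
    seen.add(key);
    terms.push(term);
  }
  return terms;
}

// Ask the model for terms; the result would back the removable chips in the UI.
async function extractTerms(text: string, callModel: ModelCall): Promise<string[]> {
  if (text.trim() === "") return [];
  const raw = await callModel(EXTRACTION_PROMPT, text);
  return parseModelOutput(raw);
}
```

Keeping the parsing step separate from the model call makes it testable without a provider, and a rejected promise from `callModel` can be surfaced by the caller as the error state.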

Test plan

  1. Open Dictionary → Import → select "Extract from text" tab
  2. Paste a paragraph containing technical jargon, names, abbreviations
  3. Click "Extract terms" → AI extracts domain-specific words as chips
  4. Remove unwanted terms by clicking ×
  5. Click Import → terms are added to dictionary with deduplication
  6. Verify it works with OpenAI, Anthropic, and Gemini providers
  7. Verify error state when no AI model is configured

Pull Request opened by Augment Code with guidance from the PR author

Add an "Import" dialog to the Custom Dictionary view that lets users
paste a list of words or phrases separated by commas, semicolons, or
line breaks.  This is useful for quickly onboarding domain-specific
vocabulary (e.g. technical terms, brand names, framework names).

The parser splits input by lines first, then by comma/semicolon within
each line.  This preserves terms that contain dots (Vue.js, babylon.js,
.tsx) because the dot is never treated as a delimiter.  Duplicate
detection is case-insensitive and prevents re-adding existing words.
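
A minimal sketch of the parsing rule just described, assuming a hypothetical helper named `parseImportText` (the name and signature are illustrative and not taken from the commit):

```typescript
// Hypothetical helper: parse pasted text into new dictionary terms.
// `existing` holds the words already in the dictionary.
function parseImportText(input: string, existing: string[] = []): string[] {
  // Track lowercase forms so duplicate detection is case-insensitive.
  const seen = new Set(existing.map((w) => w.trim().toLowerCase()));
  const result: string[] = [];

  // Split by lines first, then by comma or semicolon within each line.
  // The dot is never a delimiter, so "Vue.js" and ".tsx" stay intact.
  for (const line of input.split(/\r?\n/)) {
    for (const raw of line.split(/[,;]/)) {
      const term = raw.trim();
      if (term === "") continue;
      const key = term.toLowerCase();
      if (seen.has(key)) continue; // already in the dictionary or in this paste
      seen.add(key);
      result.push(term);
    }
  }
  return result;
}

// Example: "vue.js" and "React" are dropped as case-insensitive duplicates.
const added = parseImportText("Vue.js, babylon.js; .tsx\nvue.js\nReact", ["react"]);
console.log(added); // [ 'Vue.js', 'babylon.js', '.tsx' ]
```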

UI integration:
- Empty state: "Import" link below the example chips
- Populated state: "Import" button in the header next to "Clear all"
- Modal dialog with textarea, live word count, and confirm/cancel

Full i18n support across all 10 languages (en, de, es, fr, it, ja,
pt, ru, zh-CN, zh-TW).
@hanselstner hanselstner force-pushed the feat/dictionary-ai-extract branch from fdf01f4 to d0c84d3 Compare February 28, 2026 19:33