Features
inactinique edited this page Jan 28, 2026
Version: 1.0.0-rc.1
This document provides an overview of ClioDeck's main features.
ClioDeck integrates with Zotero for bibliography management:
- Import from Zotero: Connect to your Zotero library and import collections with a single click
- Bidirectional Sync: Detect changes (additions, modifications, deletions) between local and Zotero bibliographies
- Conflict Resolution: Three strategies - Remote Wins, Local Wins, or Manual selection
- PDF Download: Automatically download PDFs from Zotero attachments
- Collection Selection: Choose which Zotero collection to sync
- Group Support: Access Zotero group libraries
- Import: Load existing `.bib` files
- Export: Export bibliography to BibTeX format with all metadata preserved
- Round-trip: Full preservation of custom fields, tags, and notes during import/export
- Automatic Indexing: Index PDFs for semantic search using RAG
- Batch Download: Download all missing PDFs from Zotero
- Orphan Detection: Find and clean up PDFs not linked to any citation
- Re-indexing: Detect modified PDFs and propose re-indexing them
- Archive Option: Safely move orphan PDFs to archive folder instead of deleting
- Custom Tags: Organize citations with user-defined tags
- Tag Filtering: Filter bibliography by one or more tags
- Custom Fields: Store additional metadata not covered by BibTeX
- Notes: Add personal notes to citations
- Date Tracking: Automatic timestamps for added/modified citations
Interactive statistics with 4 tabs:
- Overview: Total counts, year range, PDF coverage, publication types
- Authors: Top 15 authors, collaboration metrics, publication years
- Publications: Top journals, yearly distribution histogram
- Timeline: Cumulative and annual publication trends
ClioDeck integrates with Tropy for managing primary sources:
- Import Tropy Projects: Read `.tropy` packages and `.tpy` databases
- Metadata Sync: Import title, date, creator, archive, collection, tags
- Transcription Support: Import transcriptions from Tropy notes, OCR (Tesseract), or Transkribus
- Unified RAG: Search both secondary sources (PDFs) and primary sources (Tropy) together
- Auto-sync: Detect changes in Tropy files and propose re-synchronization
- OCR Pipeline: Built-in Tesseract.js for images without transcription
- Semantic Search: Query your indexed PDFs and primary sources using natural language
- Context Retrieval: Automatically retrieves relevant passages from your corpus
- Source Citations: Every answer includes references to source documents
- Multi-source Search: Combines results from PDFs and Tropy sources
- Configurable Parameters: Adjust topK, similarity threshold, chunking strategy
- Query Embedding Cache: LRU cache for query embeddings (500 entries, 60-minute TTL)
- Context Compression: Automatic compression when context exceeds LLM limits
- RAG Explanation: Transparency about retrieval process (chunks used, compression ratio, source types)
- Stream Cancellation: Cancel ongoing generation at any time
- Ollama Support: Use local LLMs via Ollama (default: gemma2:2b)
- Embedded LLM: Download and run models directly (Qwen2.5-0.5B, 1.5B) for offline use
- Auto Fallback: Automatically switches between Ollama and embedded model
- Claude/OpenAI: Connect to cloud LLM providers (optional)
- System Prompts: Customizable system prompts in French, English, and German
- HNSW Index: Fast approximate nearest neighbor search (~15ms for 50k chunks)
- BM25 Search: Keyword-based search for proper nouns and acronyms
- RRF Fusion: Reciprocal Rank Fusion combining both approaches (60% dense / 40% sparse)
- Multilingual Query Expansion: Automatic FR↔EN translation for academic terms (e.g., "primary sources" ↔ "sources primaires")
- Exact Match Boosting: Priority for exact keyword matches
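
The RRF fusion step listed above can be illustrated with a minimal sketch. It assumes the 60% / 40% split is applied as a per-ranking weight on the usual reciprocal-rank term, and it uses the conventional constant `k = 60`; neither detail is confirmed by this page, and the function name `weighted_rrf` is hypothetical.

```python
def weighted_rrf(dense_ranked, sparse_ranked,
                 dense_weight=0.6, sparse_weight=0.4, k=60):
    """Fuse two ranked lists of chunk ids with weighted Reciprocal Rank Fusion.

    Assumption: each ranking contributes weight / (k + rank) per chunk,
    with ranks starting at 1. ClioDeck's actual formula may differ.
    """
    scores = {}
    rankings = ((dense_weight, dense_ranked), (sparse_weight, sparse_ranked))
    for weight, ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + weight / (k + rank)
    # Highest fused score first; ties are broken by chunk id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical chunk ids: "dense" comes from the vector index, "sparse" from BM25.
dense = ["a", "b", "c"]
sparse = ["c", "a", "d"]
fused = weighted_rrf(dense, sparse)
```

A chunk that appears in both rankings accumulates two terms, so it tends to outrank a chunk that scores well in only one of them.
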
- BERTopic Integration: Identify main themes in your corpus
- Topic Timeline: Visualize theme evolution over time
- Python Environment: Isolated Python venv for dependencies
- Optional Feature: Install only if needed
- Document Network: Visualize relationships between documents
- Citation Links: Track internal citations within your corpus
- Similarity Edges: Connect semantically similar documents
- Community Detection: Identify document clusters (Louvain algorithm)
- Interactive Exploration: ForceAtlas2 layout with zoom and pan
- Word Frequencies: Most common words (stopwords removed)
- N-grams: Bigram and trigram analysis
- Lexical Richness: Type-Token Ratio and vocabulary metrics
- TF-IDF: Identify the words most characteristic of each document
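
As a rough illustration of two of the measures above, the sketch below computes a Type-Token Ratio and n-gram counts with the Python standard library. The tokenizer is a simplifying assumption (lowercased `\w+` matches), stopword removal and TF-IDF are left out, and the function names are hypothetical rather than ClioDeck's own.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (simplified tokenizer)."""
    return re.findall(r"\w+", text.lower())


def type_token_ratio(tokens):
    """Number of distinct tokens divided by total tokens; 0.0 for empty input."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def ngrams(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(zip(*(tokens[i:] for i in range(n))))


tokens = tokenize("The cat sat on the mat. The cat slept.")
ttr = type_token_ratio(tokens)   # 6 distinct words out of 9 tokens
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)
```

The raw Type-Token Ratio falls as texts get longer, so it is best compared across segments of similar length.
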
- Document Comparison: Compare your text with indexed sources
- Segment Analysis: Analyze by section, paragraph, or sentence
- Recommendations: Get relevant source suggestions for each segment
- Smart Cache: Hash-based caching for performance
- WYSIWYG Markdown: Visual editing with full markdown support (Milkdown)
-
Citation Autocomplete: Type
@to insert citations from bibliography - Footnotes: Visual styling for footnotes in both dark and light themes
- Live Preview: See formatted output as you type
- Auto-save: Periodic saving with draft recovery
- Keyboard Shortcuts: Standard formatting shortcuts (Ctrl+B, Ctrl+I, etc.)
- PDF Export: Generate professional PDFs via Pandoc/LaTeX
- Word Export: Export to .docx format with template support
- Session Tracking: Track research sessions and activities
- Chat History: Review past conversations with the AI assistant
- Timeline View: Visualize activity over time
- Context Recovery: Resume previous conversations
- Date/Time Display: Sessions show both date and time
- Filter Empty Sessions: Hide sessions without activity
- Project Types: Article, Book, or Presentation projects
- Recent Projects: Quick access to recently opened projects
- Project Settings: CSL styles, export options
- Database Actions: Purge, rebuild, and optimize project database
- Per-project Configuration: Independent settings for each project
- Dark/Light Mode: Toggle between themes
- Auto Theme: Automatic switching based on time of day
- Consistent Styling: All components adapt to selected theme
- Languages: French, English, German
- Auto-detection: Detects system language on first launch
- Menu Translations: Complete localization including menus
- HNSW Index: Fast approximate nearest neighbor search (hnswlib-node)
- BM25 Index: Sparse search for keywords (natural.js)
- SQLite Storage: Persistent storage for chunks and metadata
- Separate Stores: Independent databases for PDFs and primary sources

| Strategy | Chunk Size | Overlap | Use Case |
|---|---|---|---|
| cpuOptimized | 300 words | 50 | Modest machines (8GB RAM) |
| standard | 500 words | 75 | Balanced performance |
| large | 800 words | 100 | Maximum precision (16GB+ RAM) |
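
The chunking strategies in the table can be sketched as a sliding window over words. This is a minimal sketch that assumes the Overlap column is counted in words, like the chunk size, and that chunks are cut on whitespace only; the helper `chunk_words` and the `STRATEGIES` mapping are hypothetical names.

```python
# (chunk size, overlap) per strategy, taken from the table above.
# Assumption: overlap is measured in words.
STRATEGIES = {
    "cpuOptimized": (300, 50),
    "standard": (500, 75),
    "large": (800, 100),
}


def chunk_words(text, chunk_size, overlap):
    """Split text into chunks of at most chunk_size words.

    Each chunk starts (chunk_size - overlap) words after the previous one,
    so consecutive chunks share `overlap` words.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


# A synthetic 1000-word document chunked with the cpuOptimized settings.
sample = " ".join(str(i) for i in range(1000))
size, overlap = STRATEGIES["cpuOptimized"]
chunks = chunk_words(sample, size, overlap)
```

Smaller chunks mean more vectors to index but less text per embedding call, which is consistent with the table recommending the 300-word setting for modest machines.
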

| Parameter | Default | Description |
|---|---|---|
| topK | 10 | Number of chunks to retrieve |
| similarityThreshold | 0.12 | Minimum score (RRF-optimized) |
| useHybridSearch | true | Combine HNSW + BM25 |
| enableQualityFiltering | true | Filter low-quality chunks |
| enableDeduplication | true | Remove duplicate chunks |
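
To show how `topK`, `similarityThreshold`, and `enableDeduplication` might interact, here is a minimal post-retrieval sketch. It assumes candidates already carry a fused score on the same scale as the threshold, treats two chunks as duplicates when their whitespace-normalized, lowercased text is identical, and leaves out quality filtering. `ScoredChunk` and `select_chunks` are hypothetical names, not ClioDeck's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredChunk:
    chunk_id: str
    text: str
    score: float


def select_chunks(candidates, top_k=10, similarity_threshold=0.12,
                  enable_deduplication=True):
    """Keep the best-scoring chunks above the threshold, up to top_k."""
    kept = []
    seen = set()
    for chunk in sorted(candidates, key=lambda c: c.score, reverse=True):
        if chunk.score < similarity_threshold:
            break  # sorted by score, so everything after this is too low
        key = " ".join(chunk.text.lower().split())
        if enable_deduplication and key in seen:
            continue  # same text as a higher-scoring chunk
        seen.add(key)
        kept.append(chunk)
        if len(kept) == top_k:
            break
    return kept


candidates = [
    ScoredChunk("a", "Hello world", 0.9),
    ScoredChunk("b", "hello   World", 0.8),   # duplicate of "a" after normalization
    ScoredChunk("c", "Other passage", 0.5),
    ScoredChunk("d", "Weak match", 0.05),     # below the default threshold
]
selected = select_chunks(candidates)
```

Deduplicating before counting toward `topK` means a duplicate never takes a slot that a distinct passage could fill.
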
- Settings Panel: Centralized configuration for all features
- Per-project Settings: Some settings (like database actions) are project-specific
- Persistent Storage: Settings saved via Electron Store
For detailed technical documentation on specific features, see the individual feature documentation files.