Features

ClioDeck Features

Version: 1.0.0-rc.1

This document provides an overview of ClioDeck's main features.

Bibliography Management

Zotero Integration

ClioDeck integrates with Zotero for bibliography management:

Import from Zotero: Connect to your Zotero library and import collections with a single click
Bidirectional Sync: Detect changes (additions, modifications, deletions) between local and Zotero bibliographies
Conflict Resolution: Three strategies - Remote Wins, Local Wins, or Manual selection
PDF Download: Automatically download PDFs from Zotero attachments
Collection Selection: Choose which Zotero collection to sync
Group Support: Access Zotero group libraries
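The three conflict-resolution strategies can be pictured as a small dispatch function. This is only a sketch: the Citation type, the strategy names, and resolveConflict are hypothetical and do not reflect ClioDeck's actual API.

```typescript
// Hypothetical types for illustration; ClioDeck's real data model may differ.
interface Citation {
  key: string;
  title: string;
}

type ConflictStrategy = "remote-wins" | "local-wins" | "manual";

// Returns the version to keep, or null when the user has to choose manually.
function resolveConflict(
  local: Citation,
  remote: Citation,
  strategy: ConflictStrategy,
): Citation | null {
  if (strategy === "remote-wins") return remote;
  if (strategy === "local-wins") return local;
  return null; // "manual": defer the decision to the user interface
}
```

Returning null for the manual case keeps the decision out of the sync logic, so the interface can show both versions side by side.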

BibTeX Support

Import: Load existing .bib files
Export: Export bibliography to BibTeX format with all metadata preserved
Round-trip: Full preservation of custom fields, tags, and notes during import/export

PDF Management

Automatic Indexing: Index PDFs for semantic search using RAG
Batch Download: Download all missing PDFs from Zotero
Orphan Detection: Find and clean up PDFs not linked to any citation
Re-indexing: Detect modified PDFs and propose re-indexing
Archive Option: Safely move orphan PDFs to archive folder instead of deleting
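Orphan detection amounts to a set difference between the PDFs on disk and the PDFs referenced by citations. A minimal sketch, assuming both sides are available as lists of file paths (findOrphanPdfs and the example paths are hypothetical):

```typescript
// Returns the PDF paths that no citation links to.
function findOrphanPdfs(pdfFiles: string[], linkedPdfs: string[]): string[] {
  const linked = new Set(linkedPdfs);
  return pdfFiles.filter((file) => !linked.has(file));
}

// Example: only "pdfs/old-draft.pdf" is unreferenced.
const orphans = findOrphanPdfs(
  ["pdfs/smith2020.pdf", "pdfs/old-draft.pdf"],
  ["pdfs/smith2020.pdf"],
);
```

With the archive option, the resulting paths would be moved to the archive folder rather than deleted.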

Tags and Metadata

Custom Tags: Organize citations with user-defined tags
Tag Filtering: Filter bibliography by one or more tags
Custom Fields: Store additional metadata not covered by BibTeX
Notes: Add personal notes to citations
Date Tracking: Automatic timestamps for added/modified citations

Statistics Dashboard

Interactive statistics in four tabs:

Overview: Total counts, year range, PDF coverage, publication types
Authors: Top 15 authors, collaboration metrics, publication years
Publications: Top journals, yearly distribution histogram
Timeline: Cumulative and annual publication trends

Primary Sources (Tropy Integration)

ClioDeck integrates with Tropy for managing primary sources:

Import Tropy Projects: Read .tropy packages and .tpy databases
Metadata Sync: Import title, date, creator, archive, collection, tags
Transcription Support: Import transcriptions from Tropy notes, OCR (Tesseract), or Transkribus
Unified RAG: Search both secondary sources (PDFs) and primary sources (Tropy) together
Auto-sync: Detect changes in Tropy files and propose re-synchronization
OCR Pipeline: Built-in Tesseract.js for images without transcription

AI-Powered Research Assistant

RAG (Retrieval-Augmented Generation)

Semantic Search: Query your indexed PDFs and primary sources using natural language
Context Retrieval: Automatically retrieves relevant passages from your corpus
Source Citations: Every answer includes references to source documents
Multi-source Search: Combines results from PDFs and Tropy sources
Configurable Parameters: Adjust topK, similarity threshold, chunking strategy
Query Embedding Cache: LRU cache (500 entries, 60-minute TTL) that avoids recomputing embeddings for repeated queries
Context Compression: Automatic compression when context exceeds LLM limits
RAG Explanation: Transparency about retrieval process (chunks used, compression ratio, source types)
Stream Cancellation: Cancel ongoing generation at any time
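The query embedding cache described above combines least-recently-used eviction with a time-to-live. The sketch below shows one way to get that behavior with a plain Map; the class name LruTtlCache and its methods are hypothetical, and only the 500-entry and 60-minute defaults come from this page.

```typescript
// Minimal LRU cache with a per-entry TTL. A Map preserves insertion order,
// so the first key is always the least recently used one.
class LruTtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private maxEntries: number;
  private ttlMs: number;

  constructor(maxEntries: number = 500, ttlMs: number = 60 * 60 * 1000) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
  }

  get(key: string, now: number = Date.now()): V | undefined {
    const hit = this.entries.get(key);
    if (hit === undefined) return undefined;
    if (hit.expiresAt <= now) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(key);
    this.entries.set(key, hit);
    return hit.value;
  }

  set(key: string, value: V, now: number = Date.now()): void {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
    if (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}
```

A cache for embeddings would be declared as new LruTtlCache<number[]>() and keyed by the query text. The optional now parameter exists only to make expiry testable.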

LLM Integration

Ollama Support: Use local LLMs via Ollama (default: gemma2:2b)
Embedded LLM: Download and run models directly (Qwen2.5-0.5B, 1.5B) for offline use
Auto Fallback: Automatically switches between Ollama and the embedded model
Claude/OpenAI: Connect to cloud LLM providers (optional)
System Prompts: Customizable system prompts in French, English, and German

Hybrid Search

HNSW Index: Fast approximate nearest neighbor search (~15ms for 50k chunks)
BM25 Search: Keyword-based search for proper nouns and acronyms
RRF Fusion: Reciprocal Rank Fusion combining both approaches (60% dense / 40% sparse)
Multilingual Query Expansion: Automatic FR↔EN translation for academic terms (e.g., "primary sources" ↔ "sources primaires")
Exact Match Boosting: Priority for exact keyword matches
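The 60% dense / 40% sparse fusion can be illustrated with a weighted Reciprocal Rank Fusion sketch. How ClioDeck applies the weights is not specified on this page; the version below assumes each list's reciprocal-rank term is multiplied by its weight, and it uses k = 60, a commonly used constant that is also an assumption here. The function name rrfFuse is hypothetical.

```typescript
// Weighted Reciprocal Rank Fusion over two ranked lists of chunk ids.
// Each list contributes weight / (k + rank) for every id it contains.
function rrfFuse(
  dense: string[],
  sparse: string[],
  denseWeight: number = 0.6,
  sparseWeight: number = 0.4,
  k: number = 60, // assumed constant, not taken from ClioDeck
): Array<[string, number]> {
  const scores = new Map<string, number>();
  const accumulate = (ranking: string[], weight: number): void => {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + weight / (k + rank));
    });
  };
  accumulate(dense, denseWeight);
  accumulate(sparse, sparseWeight);
  // Highest fused score first.
  return Array.from(scores.entries()).sort((a, b) => b[1] - a[1]);
}

// "b" appears in both lists, so it outranks "a" despite "a" leading the dense list.
const fused = rrfFuse(["a", "b", "c"], ["b", "d"]);
```

Because only ranks enter the formula, the dense similarity scores and the BM25 scores never need to be put on a common scale.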

Topic Modeling (Optional)

BERTopic Integration: Identify main themes in your corpus
Topic Timeline: Visualize theme evolution over time
Python Environment: Isolated Python venv for dependencies
Optional Feature: Install only if needed

Corpus Analysis

Knowledge Graph

Document Network: Visualize relationships between documents
Citation Links: Track internal citations within your corpus
Similarity Edges: Connect semantically similar documents
Community Detection: Identify document clusters (Louvain algorithm)
Interactive Exploration: ForceAtlas2 layout with zoom and pan

Textometrics

Word Frequencies: Most common words (stopwords removed)
N-grams: Bigram and trigram analysis
Lexical Richness: Type-Token Ratio and vocabulary metrics
TF-IDF: Identification of characteristic words
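Two of these measures are simple enough to sketch directly. The Type-Token Ratio is the number of distinct words divided by the total number of words, and n-grams are sliding windows over the token list. The tokenizer below is a deliberately naive assumption (whitespace split, punctuation trimmed, no stopword removal) and all function names are hypothetical.

```typescript
// Naive tokenizer: lowercase, split on whitespace, trim surrounding punctuation.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/\s+/)
    .map((token) => token.replace(/^[.,;:!?"'()]+|[.,;:!?"'()]+$/g, ""))
    .filter((token) => token.length > 0);
}

// Sliding windows of n consecutive tokens, joined with a space.
function ngrams(tokens: string[], n: number): string[] {
  const result: string[] = [];
  for (let i = 0; i + n <= tokens.length; i++) {
    result.push(tokens.slice(i, i + n).join(" "));
  }
  return result;
}

// Distinct tokens (types) divided by total tokens; 0 for empty input.
function typeTokenRatio(tokens: string[]): number {
  return tokens.length === 0 ? 0 : new Set(tokens).size / tokens.length;
}

const sampleTokens = tokenize("The archive holds the letters. The letters matter!");
const sampleBigrams = ngrams(sampleTokens, 2);
const sampleTtr = typeTokenRatio(sampleTokens); // 5 types / 8 tokens = 0.625
```

The Type-Token Ratio falls as texts get longer, so it is best compared between segments of similar length.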

Similarity Finder

Document Comparison: Compare your text with indexed sources
Segment Analysis: Analyze by section, paragraph, or sentence
Recommendations: Get relevant source suggestions for each segment
Smart Cache: Hash-based caching for performance
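Hash-based caching means a segment's text is hashed and the hash is used as the lookup key, so unchanged segments are not analyzed twice. The sketch below uses 32-bit FNV-1a over UTF-16 code units purely as a dependency-free stand-in; the hash ClioDeck actually uses is not stated here, and SegmentCache is a hypothetical name.

```typescript
// FNV-1a (32-bit) over UTF-16 code units, returned as a hex string.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by the FNV prime, keep unsigned
  }
  return hash.toString(16);
}

// Caches the result of an expensive per-segment computation by content hash.
class SegmentCache<R> {
  private store = new Map<string, R>();
  hits = 0;

  getOrCompute(segment: string, compute: (segment: string) => R): R {
    const key = fnv1a(segment);
    const cached = this.store.get(key);
    if (cached !== undefined) {
      this.hits++;
      return cached;
    }
    const result = compute(segment);
    this.store.set(key, result);
    return result;
  }
}
```

A 32-bit hash can collide on large corpora; a longer hash such as SHA-256 would be the safer choice outside a sketch.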

Document Editing

Milkdown Editor

WYSIWYG Markdown: Visual editing with full markdown support (Milkdown)
Citation Autocomplete: Type @ to insert citations from bibliography
Footnotes: Visual styling for footnotes in both dark and light themes
Live Preview: See formatted output as you type
Auto-save: Periodic saving with draft recovery
Keyboard Shortcuts: Standard formatting shortcuts (Ctrl+B, Ctrl+I, etc.)

Export Options

PDF Export: Generate professional PDFs via Pandoc/LaTeX
Word Export: Export to .docx format with template support

Research Journal

Session Tracking: Track research sessions and activities
Chat History: Review past conversations with the AI assistant
Timeline View: Visualize activity over time
Context Recovery: Resume previous conversations
Date/Time Display: Sessions show both date and time
Filter Empty Sessions: Hide sessions without activity

Project Management

Project Types: Article, Book, or Presentation projects
Recent Projects: Quick access to recently opened projects
Project Settings: CSL styles, export options
Database Actions: Purge, rebuild, and optimize project database
Per-project Configuration: Independent settings for each project

User Interface

Themes

Dark/Light Mode: Toggle between themes
Auto Theme: Automatic switching based on time of day
Consistent Styling: All components adapt to selected theme

Internationalization

Languages: French, English, German
Auto-detection: Detects system language on first launch
Menu Translations: Complete localization including menus

Technical Features

Vector Database

HNSW Index: Fast approximate nearest neighbor search (hnswlib-node)
BM25 Index: Sparse search for keywords (natural.js)
SQLite Storage: Persistent storage for chunks and metadata
Separate Stores: Independent databases for PDFs and primary sources

Chunking Strategies

Strategy	Chunk Size	Overlap	Use Case
cpuOptimized	300 words	50	Modest machines (8GB RAM)
standard	500 words	75	Balanced performance
large	800 words	100	Maximum precision (16GB+ RAM)
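The table above can be read as a sliding window over the document's words. The sketch below assumes the overlap is also measured in words, which the table does not state explicitly; CHUNKING_STRATEGIES and chunkWords are hypothetical names.

```typescript
interface ChunkingStrategy {
  chunkSize: number; // words per chunk
  overlap: number; // words shared with the previous chunk (assumed unit)
}

// Values copied from the table above.
const CHUNKING_STRATEGIES = {
  cpuOptimized: { chunkSize: 300, overlap: 50 },
  standard: { chunkSize: 500, overlap: 75 },
  large: { chunkSize: 800, overlap: 100 },
};

// Splits a word list into overlapping chunks.
function chunkWords(words: string[], strategy: ChunkingStrategy): string[][] {
  const { chunkSize, overlap } = strategy;
  if (overlap >= chunkSize) throw new Error("overlap must be smaller than chunkSize");
  const step = chunkSize - overlap;
  const chunks: string[][] = [];
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize));
    if (start + chunkSize >= words.length) break; // this chunk reached the end
  }
  return chunks;
}
```

With the standard strategy, a 1,000-word document yields three chunks starting at words 0, 425, and 850, the last one being shorter than the others.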

RAG Configuration

Parameter	Default	Description
topK	10	Number of chunks to retrieve
similarityThreshold	0.12	Minimum score (RRF-optimized)
useHybridSearch	true	Combine HNSW + BM25
enableQualityFiltering	true	Filter low-quality chunks
enableDeduplication	true	Remove duplicate chunks
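As an illustration of how these defaults might interact, the sketch below applies the threshold, deduplication, and topK to a list of already-scored chunks. The RagConfig interface mirrors the table; everything else (ScoredChunk, selectChunks, deduplication by exact text) is an assumption, and hybrid search and quality filtering are left out.

```typescript
interface RagConfig {
  topK: number;
  similarityThreshold: number;
  useHybridSearch: boolean;
  enableQualityFiltering: boolean;
  enableDeduplication: boolean;
}

// Defaults copied from the table above.
const DEFAULT_RAG_CONFIG: RagConfig = {
  topK: 10,
  similarityThreshold: 0.12,
  useHybridSearch: true,
  enableQualityFiltering: true,
  enableDeduplication: true,
};

// Hypothetical shape of a retrieved chunk.
interface ScoredChunk {
  id: string;
  text: string;
  score: number;
}

function selectChunks(
  candidates: ScoredChunk[],
  overrides: Partial<RagConfig> = {},
): ScoredChunk[] {
  const config: RagConfig = { ...DEFAULT_RAG_CONFIG, ...overrides };
  // Sort a copy by descending score so deduplication keeps the best copy.
  let kept = candidates
    .slice()
    .sort((a, b) => b.score - a.score)
    .filter((chunk) => chunk.score >= config.similarityThreshold);
  if (config.enableDeduplication) {
    const seen = new Set<string>();
    kept = kept.filter((chunk) => {
      if (seen.has(chunk.text)) return false;
      seen.add(chunk.text);
      return true;
    });
  }
  return kept.slice(0, config.topK);
}
```

Passing overrides such as { topK: 5 } changes a single parameter while leaving the other defaults in place.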

Configuration

Settings Panel: Centralized configuration for all features
Per-project Settings: Some settings (like database actions) are project-specific
Persistent Storage: Settings saved via Electron Store

For detailed technical documentation on specific features, see the individual feature documentation files.
