Lupin CLI

Lupin CLI is a command-line toolkit designed to verify the authenticity of digital documents using state-of-the-art image forensics, metadata inspection, and LLM-powered text analysis. It brings together a collection of proven forensic algorithms and modern AI techniques into a single, unified, scriptable interface.

The tool is built for investigators, journalists, researchers, archivists, and security teams who need a fast and reliable way to assess whether images or documents have been manipulated. Lupin CLI performs deep analysis at multiple levels: compression artifacts, sensor noise patterns, lighting consistency, metadata coherence, and cross-checked AI summaries that consolidate all findings into a human-readable verdict.

Installation

# Install UV if not already installed
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone and install
git clone https://github.com/your-repo/lupin-cli.git
cd lupin-cli
make install

# (Optional) Configure LLM for AI-powered summaries
lupin bootstrap

Quick Start

# Analyze an image
lupin check image.jpg

# Analyze a directory recursively
lupin check ./documents/ --recursive

# Export results to CSV
lupin check *.jpg --output results.csv

Analyzers

Lupin CLI uses multiple forensic techniques to detect image manipulation. Each analyzer focuses on a specific aspect of image authenticity:

| Analyzer | Method | What It Detects | Best For |
|---|---|---|---|
| Metadata | EXIF/XMP analysis | Missing camera data, editing software traces, timestamp anomalies | Detecting edited photos, identifying source |
| ELA | Error Level Analysis | Compression artifact inconsistencies from local edits | JPEG manipulation, splicing, cloning |
| JPEG Ghost | Multi-quality recompression | Regions saved at different JPEG qualities | Detecting pasted content from other JPEGs |
| Double JPEG | DCT coefficient analysis | Images compressed multiple times | Re-saved/edited JPEGs |
| Copy-Move | Keypoint matching (ORB) | Duplicated/cloned regions within the image | Object removal, duplication fraud |
| PRNU | Sensor noise pattern analysis | Inconsistent camera sensor fingerprints | Spliced regions from different cameras |
| Interpolation | Frequency domain analysis | Resampling artifacts from resize/rotate | Scaled or rotated paste operations |
| Noise Inconsistency | Block-wise noise variance | Regions with different noise characteristics | Composited images from multiple sources |
| Chromatic Aberration | Color channel misalignment | Inconsistent lens distortion patterns | Spliced content from different lenses |
| Shadow/Lighting | Gradient and shadow analysis | Inconsistent light direction or shadow color | Composited objects with wrong lighting |
| Visual Inconsistency | Edge, texture, color analysis | Abrupt visual discontinuities | General compositing and local edits |
| LLM Summary | AI cross-analysis | Synthesizes all results into a verdict | Human-readable final assessment |
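
To give a concrete feel for one of these techniques, the short Python sketch below illustrates the core idea of Error Level Analysis: re-save the image at a known JPEG quality, diff it against the original, and amplify the residual so locally edited regions stand out. This is a simplified illustration using Pillow, not Lupin's implementation, and the quality/scale values are arbitrary.

# Minimal Error Level Analysis (ELA) illustration -- not Lupin's implementation.
import io
from PIL import Image, ImageChops

def ela_map(path, quality=90, scale=15):
    original = Image.open(path).convert("RGB")
    # Re-save at a fixed JPEG quality...
    buffer = io.BytesIO()
    original.save(buffer, "JPEG", quality=quality)
    buffer.seek(0)
    resaved = Image.open(buffer)
    # ...then diff against the original; edited regions tend to recompress differently.
    residual = ImageChops.difference(original, resaved)
    # Amplify the residual so subtle compression inconsistencies become visible.
    return residual.point(lambda value: min(255, value * scale))

ela_map("image.jpg").save("image_ela.png")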

Detection Strategy

The analyzers work together using complementary approaches:

┌─────────────────────────────────────────────────────────────────────────┐
│                         IMAGE AUTHENTICITY                              │
├─────────────────────────────────────────────────────────────────────────┤
│  COMPRESSION ARTIFACTS          SENSOR/NOISE PATTERNS                   │
│  ├── ELA                        ├── PRNU                                │
│  ├── JPEG Ghost                 ├── Noise Inconsistency                 │
│  └── Double JPEG                └── Chromatic Aberration                │
├─────────────────────────────────────────────────────────────────────────┤
│  GEOMETRIC TRANSFORMS           VISUAL CONSISTENCY                      │
│  ├── Interpolation              ├── Visual Inconsistency                │
│  └── Copy-Move                  └── Shadow/Lighting                     │
├─────────────────────────────────────────────────────────────────────────┤
│  METADATA                       AI SYNTHESIS                            │
│  └── EXIF/XMP Analysis          └── LLM Summary                         │
└─────────────────────────────────────────────────────────────────────────┘

Scoring: Each analyzer produces a score (0.0 = manipulated, 1.0 = authentic) and a confidence level. The overall score is calculated from both the per-analyzer confidence and fixed reliability weights:

| Reliability | Analyzers | Weight |
|---|---|---|
| High | Metadata, ELA, Copy-Move, Double JPEG | 0.85 - 1.0 |
| Medium | Interpolation, PRNU, Visual Inconsistency, Chromatic Aberration | 0.6 - 0.7 |
| Lower | JPEG Ghost, Noise Inconsistency, Shadow/Lighting | 0.4 - 0.5 |

This weighting ensures that analyzers prone to false positives on authentic images don't unfairly drag down the overall score.

Interpretation:

  • 0.77 - 1.0: Likely authentic
  • 0.35 - 0.77: Uncertain, requires review
  • 0.0 - 0.35: Likely manipulated
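
To illustrate how such a weighting scheme can combine per-analyzer results, the sketch below computes a confidence- and reliability-weighted average and maps it onto the ranges above. The exact formula and per-analyzer weights Lupin uses may differ; the numbers here are only placed within the ranges from the reliability table.

# Hypothetical aggregation sketch -- the exact formula Lupin uses may differ.
RELIABILITY = {
    # High reliability (0.85 - 1.0)
    "Metadata": 1.0, "ELA": 0.95, "Copy-Move": 0.9, "Double JPEG": 0.85,
    # Medium reliability (0.6 - 0.7)
    "Interpolation": 0.7, "PRNU": 0.65, "Visual Inconsistency": 0.65, "Chromatic Aberration": 0.6,
    # Lower reliability (0.4 - 0.5)
    "JPEG Ghost": 0.5, "Noise Inconsistency": 0.45, "Shadow/Lighting": 0.4,
}

def overall_score(results):
    """results: list of (analyzer, score, confidence) tuples, each value in [0.0, 1.0]."""
    weights = [confidence * RELIABILITY.get(name, 0.5) for name, _, confidence in results]
    total = sum(weights)
    if total == 0:
        return None
    return sum(w * score for w, (_, score, _) in zip(weights, results)) / total

def verdict(score):
    # Thresholds taken from the interpretation ranges above.
    if score >= 0.77:
        return "LIKELY AUTHENTIC"
    if score >= 0.35:
        return "UNCERTAIN"
    return "LIKELY MANIPULATED"

print(verdict(overall_score([("ELA", 0.30, 0.8), ("Metadata", 0.55, 0.9)])))  # -> UNCERTAIN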

Understanding the Output

When you export results to CSV (--output results.csv), Lupin creates a spreadsheet you can open in Excel, Google Sheets, or any spreadsheet application.

CSV Columns Explained

| Column | What It Means |
|---|---|
| file_path | The file that was analyzed |
| file_hash | SHA-256 hash of the file contents (for verification/deduplication) |
| file_type | Type of file (image, pdf, docx) |
| is_embedded | "Yes" if this is an image extracted from a PDF/DOCX |
| parent_file | If embedded, which document contained this image |
| method | Which analyzer produced this row (or "FINAL" for the summary) |
| method_score | Score from 0.0 to 1.0 (higher = more likely authentic) |
| method_confidence | How confident the analyzer is (0.0 to 1.0) |
| method_details | Technical details about what was found |
| verdict | Only in FINAL rows: "LIKELY AUTHENTIC", "UNCERTAIN", or "LIKELY MANIPULATED" |
| llm_verdict | The AI's assessment (if an LLM is configured) |
| llm_reasoning | The AI's explanation of its verdict |
| llm_provider | Which AI service was used |
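
If you prefer to triage an export programmatically instead of in a spreadsheet, the short sketch below uses Python's standard csv module and the column names above to print one verdict line per file; the filtering logic is just one way to surface the FINAL rows.

# Print the per-file verdicts from a Lupin CSV export (column names as documented above).
import csv

with open("results.csv", newline="") as handle:
    for row in csv.DictReader(handle):
        if row["method"] == "FINAL":  # FINAL rows carry the overall verdict for each file
            print(f'{row["file_path"]}: {row["verdict"]} (score {row["method_score"]})')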

How to Read the Results

  1. Look at FINAL rows first - Filter the spreadsheet to show only rows where method = "FINAL". These give you the overall verdict for each file.

  2. Check the verdict column - This is your quick answer:

    • LIKELY AUTHENTIC - No strong signs of manipulation detected
    • UNCERTAIN - Some anomalies found; manual review recommended
    • LIKELY MANIPULATED - Multiple indicators suggest the image was edited
  3. Review individual analyzer scores - If a file is marked UNCERTAIN or LIKELY MANIPULATED, look at which analyzers gave low scores to understand what type of manipulation may have occurred.

  4. Consider the context - A low score on one analyzer isn't proof of manipulation. Look for patterns across multiple analyzers.

Example: Reading a Result

file_path: photo.jpg
method: FINAL
method_score: 0.42
verdict: UNCERTAIN
llm_verdict: LIKELY MANIPULATED
llm_reasoning: "ELA shows compression inconsistencies in the upper-left region.
                Metadata indicates the image was processed with Photoshop."

This tells you:

  • The overall score (0.42) falls in the uncertain range
  • The AI detected specific issues with compression and metadata
  • You should examine the image more closely, particularly the upper-left area

Tips for Non-Technical Users

  • Start with the verdict - Don't get overwhelmed by numbers; the verdict column gives you the bottom line
  • Trust patterns, not single scores - One low score might be a false positive; multiple low scores are more significant
  • LLM reasoning is your friend - If configured, the AI explanation translates technical findings into plain language
  • When in doubt, flag for review - UNCERTAIN means exactly that; get a second opinion from an expert

Important Caveats

Absence of evidence is not evidence of absence. A "LIKELY AUTHENTIC" verdict means no manipulation was detected by these specific techniques - it does not guarantee the image is unaltered. Sophisticated edits, AI-generated images, or manipulations outside the scope of these analyzers may go undetected.

Extraordinary claims require extraordinary evidence. If you're making serious accusations based on analysis results, ensure you have strong, corroborating evidence from multiple sources. A single tool's output - no matter how sophisticated - should be one piece of a larger investigation, not the sole basis for conclusions. Always consider alternative explanations and seek expert validation for high-stakes decisions.

Configuration

Quick Setup

lupin bootstrap

This interactive wizard configures your LLM provider (Anthropic, OpenAI, or Ollama) and saves your settings to .env.

To verify your configuration:

lupin config

Manual Configuration

# LLM Provider (anthropic, openai, or ollama)
export LUPIN_LLM_PROVIDER=anthropic
export LUPIN_ANTHROPIC_API_KEY=your_key

# Or for local Ollama
export LUPIN_LLM_PROVIDER=ollama
export LUPIN_OLLAMA_HOST=http://localhost:11434
export LUPIN_OLLAMA_MODEL=llama3.2

GPU Acceleration (Optional)

# Install GPU support (requires NVIDIA GPU + CUDA 12.x)
uv sync --extra gpu

GPU provides 2-3x speedup for Visual Inconsistency and PRNU analysis. The tool automatically falls back to CPU if GPU is unavailable.

Development

# Install dev dependencies
make dev

# Run tests
make test

# Run linter
make lint

# Format code
make format

Supported File Types

  • Images: Any format recognized by your system's file type detection, including:
    • Common formats: JPEG, PNG, GIF, BMP, TIFF, WebP, AVIF
    • Professional formats: PSD, EPS, PCX, TGA, DDS
    • Scientific formats: FITS, HDF5
    • Other formats: ICO, ICNS, PPM, PGM, PBM, SGI, QOI, etc.
  • Documents: PDF, DOCX, DOC

Limitations

  • Image forensics works best on JPEG images
  • Text analysis requires LLM API access
  • PRNU analysis is most effective on images from the same camera
  • Results are indicators, not definitive proof

Why the Name "Lupin"?

Inspired by Arsène Lupin, the master of disguise in French literature, this tool aims to reveal what is hidden behind the surface: uncovering manipulations, inconsistencies, and digital "disguises" that might go unnoticed during manual inspection.

License

MIT License
