json2vcf

Convert Nirvana/Illumina Connected Annotations JSON output to VCF 4.2 format.

Features

Pure Python, zero external dependencies
Streaming pipeline — processes one position at a time, no full-file load
Reads .json and .json.gz input
Supports GRCh37 and GRCh38 assemblies (auto-detected from header)
Allele normalization — trims shared prefix/suffix to minimal VCF representation (enabled by default)
Multi-allelic decomposition — splits multi-allelic sites into biallelic rows, like bcftools norm -m- (--decompose)
VEP-style CSQ field with per-transcript annotations
Annotations: gnomAD, ClinVar, SpliceAI, REVEL, DANN, GERP, phyloP, 1000 Genomes, TOPMed
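
As an illustration of the allele normalization feature listed above, the following is a minimal sketch of prefix/suffix trimming. It is not taken from the json2vcf source: the function name `normalize_alleles` and its signature are hypothetical. It also assumes both alleles are non-empty strings, and it only trims (it does not left-align indels against a reference sequence).

```python
from typing import Tuple


def normalize_alleles(pos: int, ref: str, alt: str) -> Tuple[int, str, str]:
    """Trim bases shared by REF and ALT, keeping at least one base in each.

    Hypothetical helper for illustration; assumes non-empty alleles.
    """
    # Trim shared trailing bases first; the position does not change.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Then trim shared leading bases; each trimmed base advances the position.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


if __name__ == "__main__":
    print(normalize_alleles(100, "CAG", "CTG"))
    print(normalize_alleles(100, "ATG", "AG"))
```

Trimming the suffix before the prefix keeps the leftmost possible position for indels, which matches the usual VCF convention of retaining one anchoring base on the left.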

Installation

pip install -e .

Usage

# Basic conversion
json2vcf -i input.json.gz -o output.vcf

# VEP-style CSQ only (no flat INFO fields)
json2vcf -i input.json -o output.vcf --csq-only

# Omit sample/genotype columns
json2vcf -i input.json.gz -o output.vcf --no-samples

# Override genome assembly
json2vcf -i input.json.gz -o output.vcf --assembly GRCh37

# Disable allele normalization (keep raw Nirvana alleles)
json2vcf -i input.json.gz -o output.vcf --no-normalize

# Decompose multi-allelic sites into biallelic rows
json2vcf -i input.json.gz -o output.vcf --decompose

# Write to stdout (omit -o)
json2vcf -i input.json.gz
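
To make the effect of --decompose more concrete, here is a rough sketch of splitting one multi-allelic record into biallelic rows. The record layout (a dict with `alt` and `info` keys), the function name `decompose`, and the `per_allele_keys` parameter are all assumptions made for illustration, not the tool's actual data model. Genotype rewriting, which bcftools norm -m- also performs, is left out.

```python
from typing import Any, Dict, List, Set


def decompose(record: Dict[str, Any], per_allele_keys: Set[str]) -> List[Dict[str, Any]]:
    """Split a multi-allelic record into one record per ALT allele.

    Hypothetical layout: record["alt"] is a list of ALT alleles, and INFO
    values named in per_allele_keys are lists with one entry per ALT.
    """
    rows = []
    for i, alt in enumerate(record["alt"]):
        info = {}
        for key, value in record["info"].items():
            # Per-allele fields keep only the entry for this ALT;
            # site-level fields are copied unchanged.
            info[key] = [value[i]] if key in per_allele_keys else value
        rows.append({**record, "alt": [alt], "info": info})
    return rows


if __name__ == "__main__":
    site = {
        "chrom": "chr1", "pos": 100, "ref": "A", "alt": ["C", "T"],
        "info": {"AF": [0.1, 0.2], "DP": 50},
    }
    for row in decompose(site, {"AF"}):
        print(row)
```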

Development

pip install -e ".[dev]"
python3 -m pytest -v

Architecture

Streaming pipeline: parse → map → write

json2vcf/parser.py — Streams Nirvana's line-based JSON format, yielding (NirvanaHeader, Position) tuples
json2vcf/mapper.py — Transforms positions into VCF record dicts (per-allele fields, CSQ, INFO escaping)
json2vcf/vcf_writer.py — Writes VCF 4.2 plain text
json2vcf/models.py — Dataclass contracts between parser and mapper
json2vcf/constants.py — VCF header definitions, contig maps, CSQ field names
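
The parse → map → write flow described above can be sketched with chained generators, so that only one position is held in memory at a time. This is a simplified illustration and not the package's code: the function names are hypothetical, the input is assumed to place each position object on its own line beginning with `{"chromosome"`, and the mapper emits only the first eight VCF columns with placeholder values.

```python
import gzip
import io
import json
from typing import Iterable, Iterator, TextIO


def open_text(path: str) -> TextIO:
    """Open a .json or .json.gz file as text."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


def parse_positions(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one position dict per line (assumed line-based layout)."""
    for line in lines:
        text = line.strip().rstrip(",")
        # Skip the header line and section delimiters in this sketch.
        if text.startswith('{"chromosome"'):
            yield json.loads(text)


def map_position(position: dict) -> str:
    """Turn a position dict into a tab-separated VCF row (no annotations)."""
    return "\t".join([
        position["chromosome"],
        str(position["position"]),
        ".",
        position["refAllele"],
        ",".join(position["altAlleles"]),
        ".", ".", ".",
    ])


def write_vcf(rows: Iterable[str], out: TextIO) -> None:
    """Write a minimal VCF 4.2 header followed by the rows."""
    out.write("##fileformat=VCFv4.2\n")
    out.write("#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n")
    for row in rows:
        out.write(row + "\n")


if __name__ == "__main__":
    sample = [
        '{"header":{"genomeAssembly":"GRCh38"},"positions":[',
        '{"chromosome":"chr1","position":100,"refAllele":"A","altAlleles":["C","T"]},',
        '{"chromosome":"chr1","position":200,"refAllele":"G","altAlleles":["A"]}',
        '],"genes":[',
        ']}',
    ]
    buffer = io.StringIO()
    write_vcf(map(map_position, parse_positions(sample)), buffer)
    print(buffer.getvalue(), end="")
```

Because each stage consumes an iterator and produces the next item lazily, memory use stays roughly constant regardless of input size.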
