This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
SeqAn2 headers are vendored in the yara-seqan2-sys crate, so no external checkout is needed:
cargo build # debug build
cargo build --release # release build
cargo clippy --all-targets # lint (no clippy.toml; use default rules)
cargo fmt --check # format check
cargo run --example build_index -- <ref.fasta> # build index (from yara-mapper crate)
cargo run --example align -- <index_prefix> # map reads (from yara-mapper crate)
To use a local SeqAn2 checkout instead of the vendored headers, set SEQAN_DIR:
export SEQAN_DIR=/path/to/seqan # optional override (contains include/ and apps/yara/)
On macOS, OpenMP is detected via Homebrew libomp (/opt/homebrew/opt/libomp/). Override with HOMEBREW_PREFIX if needed. On Linux, libgomp is used.
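The OpenMP detection described above can be sketched as a pair of pure functions. This is a minimal illustration, not the repository's actual build.rs: the helper names `libomp_dir` and `openmp_link_directives` are hypothetical, and the exact cargo directives the real build script prints are an assumption.

```rust
use std::path::PathBuf;

/// Resolve the directory expected to contain Homebrew's libomp.
/// `homebrew_prefix` mirrors the HOMEBREW_PREFIX environment variable;
/// when it is unset, fall back to the default /opt/homebrew location.
fn libomp_dir(homebrew_prefix: Option<&str>) -> PathBuf {
    let prefix = homebrew_prefix.unwrap_or("/opt/homebrew");
    PathBuf::from(prefix).join("opt").join("libomp")
}

/// Build the cargo directives a build script might print for OpenMP.
/// Hypothetical helper: the real build.rs may emit different flags.
fn openmp_link_directives(target_os: &str, homebrew_prefix: Option<&str>) -> Vec<String> {
    match target_os {
        "macos" => {
            let lib = libomp_dir(homebrew_prefix).join("lib");
            vec![
                format!("cargo:rustc-link-search=native={}", lib.display()),
                "cargo:rustc-link-lib=dylib=omp".to_string(),
            ]
        }
        // On Linux the GNU OpenMP runtime (libgomp) is used.
        _ => vec!["cargo:rustc-link-lib=dylib=gomp".to_string()],
    }
}
```

In a real build script the first argument would typically come from the CARGO_CFG_TARGET_OS environment variable and the second from `std::env::var("HOMEBREW_PREFIX").ok()`.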
Three-crate workspace wrapping the YARA read mapper and indexer (SeqAn2 C++ template library) via FFI, allowing Rust code to build an FM index from a FASTA file and map paired-end DNA reads without SAM serialization.
Rust caller
→ yara-mapper crate (safe API: YaraIndexer, YaraMapper, ReadBatch, YaraRecord)
→ yara-mapper-sys crate (raw extern "C" FFI bindings + build.rs)
→ yara-seqan2-sys crate (vendored SeqAn2 headers, exposes paths via links metadata)
→ yara_indexer_shim.cpp (C++ shim wrapping SeqAn2 indexer templates)
→ yara_shim.cpp (C++ shim wrapping SeqAn2 mapper templates)
→ YARA indexer / mapper (SeqAn2 header-only library)
yara-seqan2-sys crate:
- vendor/include/: full SeqAn2 header-only library
- vendor/apps/yara/: YARA application headers
- build.rs: exposes DEP_SEQAN2_INCLUDE and DEP_SEQAN2_YARA_APP via cargo links metadata
- No Rust code; header-only, no compilation
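The links-metadata mechanism works as follows: a build script in a crate that declares a `links` key prints `cargo:KEY=VALUE`, and cargo passes the value to the build scripts of direct dependents as `DEP_<LINKS>_<KEY>`. The sketch below shows that naming rule. The `links` value `seqan2` and the keys `include` and `yara_app` are inferred from the variable names above and are assumptions, as are both helper names.

```rust
/// Format a metadata directive for a build script in a crate whose
/// Cargo.toml declares a `links` key.
fn metadata_directive(key: &str, value: &str) -> String {
    format!("cargo:{}={}", key, value)
}

/// Name of the environment variable cargo sets for the build scripts
/// of direct dependents: DEP_<LINKS>_<KEY>, upper-cased.
fn dep_env_var(links: &str, key: &str) -> String {
    format!("DEP_{}_{}", links.to_uppercase(), key.to_uppercase())
}
```

A dependent build script would then read the path with something like `std::env::var("DEP_SEQAN2_INCLUDE")` and pass it to the C++ compiler as an include directory.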
build.rs— compilescpp/yara_shim.cppandcpp/yara_indexer_shim.cppvia thecccrate, linking SeqAn2 headers, zlib, and OpenMPcpp/yara_shim.h/cpp/yara_shim.cpp— mapper C API: 7 functions, 4repr(C)structs;RecordCollectorreplaces YARA's SAM writer; factory chain dispatches by contig limits + sensitivitycpp/yara_indexer_shim.h/cpp/yara_indexer_shim.cpp— indexer C API: 5 functions, 2 structs; builds FM index from FASTA, saves 5 index files (.txt,.rid,.txt.size,.sa,.lf); factory chain dispatches by contig limitssrc/lib.rs— rawunsafe extern "C"declarations matching both C headers
yara-mapper crate:
- indexer.rs: YaraIndexer (owns a NonNull<YaraIndexerHandle>, Send but not Sync), IndexerOptions
- mapper.rs: YaraMapper (owns a NonNull<YaraMapperHandle>, Send but not Sync due to internal OpenMP), ReadBatch (collects CString pairs)
- options.rs: MapperOptions with to_ffi() conversion; enums Sensitivity and SecondaryMode
- record.rs: YaraRecord (fully owned, all heap data copied from C++ then freed), CigarOp
- error.rs: YaraError enum with variants IndexOpen, IndexBuild, Mapping, InvalidInput
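To illustrate why a batch type would hold CString values and how an InvalidInput error can arise, here is a minimal sketch. `PairBatch` and `BatchError` are hypothetical stand-ins, not the crate's ReadBatch or YaraError; the field layout and the String payload are assumptions.

```rust
use std::ffi::CString;

/// Hypothetical error type; only the InvalidInput case is modeled.
#[derive(Debug, PartialEq)]
enum BatchError {
    InvalidInput(String),
}

/// Hypothetical batch of paired reads stored as NUL-terminated C strings,
/// ready to be passed across an FFI boundary as `*const c_char`.
#[derive(Default)]
struct PairBatch {
    names: Vec<CString>,
    r1: Vec<CString>,
    r2: Vec<CString>,
}

/// Convert bytes to a CString, rejecting interior NUL bytes.
fn to_cstring(bytes: &[u8], what: &str) -> Result<CString, BatchError> {
    CString::new(bytes)
        .map_err(|_| BatchError::InvalidInput(format!("{} contains a NUL byte", what)))
}

impl PairBatch {
    /// Add one read pair. All three conversions happen before any push,
    /// so a failure leaves the batch unchanged.
    fn push(&mut self, name: &str, r1: &[u8], r2: &[u8]) -> Result<(), BatchError> {
        let name = to_cstring(name.as_bytes(), "name")?;
        let r1 = to_cstring(r1, "read 1")?;
        let r2 = to_cstring(r2, "read 2")?;
        self.names.push(name);
        self.r1.push(r1);
        self.r2.push(r2);
        Ok(())
    }

    fn len(&self) -> usize {
        self.names.len()
    }
}
```

Validating at insertion time keeps the later FFI call infallible with respect to string encoding, which is one plausible reason to store CString rather than String.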
C++ allocates heap memory for CIGAR arrays, seq/qual strings, and XA tags inside each YaraAlignmentRecord. The Rust side copies these into owned types (Vec<CigarOp>, Vec<u8>, String) in convert_record(), then calls yara_mapper_free_records() to deallocate the C++ memory. On error (map_paired returns -1), C++ frees partial records itself before returning, so the Rust side does not call free_records in the error path.
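The copy-then-free pattern can be sketched without the C++ side by letting Rust play the role of the foreign allocator. Everything here is a stand-in: `RawRecord` is a hypothetical one-field version of the C record, and `fake_alloc_record` / `fake_free_record` simulate the C++ allocation and yara_mapper_free_records() respectively.

```rust
use std::ptr;
use std::slice;

/// Hypothetical C-layout record owning a heap-allocated CIGAR array.
#[repr(C)]
struct RawRecord {
    cigar: *mut u32,
    cigar_len: usize,
}

/// Fully owned Rust-side record.
#[derive(Debug, PartialEq)]
struct OwnedRecord {
    cigar: Vec<u32>,
}

/// Stand-in for the C++ allocator: hands out a raw pointer to a boxed slice.
fn fake_alloc_record(ops: &[u32]) -> RawRecord {
    let boxed: Box<[u32]> = ops.to_vec().into_boxed_slice();
    let cigar_len = boxed.len();
    RawRecord { cigar: Box::into_raw(boxed) as *mut u32, cigar_len }
}

/// Stand-in for the C++ free function: releases what fake_alloc_record created.
unsafe fn fake_free_record(raw: &mut RawRecord) {
    if !raw.cigar.is_null() {
        let slice_ptr = ptr::slice_from_raw_parts_mut(raw.cigar, raw.cigar_len);
        unsafe { drop(Box::from_raw(slice_ptr)) };
        raw.cigar = ptr::null_mut();
        raw.cigar_len = 0;
    }
}

/// Copy the foreign data into owned Rust types; does not free anything.
unsafe fn convert_record(raw: &RawRecord) -> OwnedRecord {
    let cigar = if raw.cigar.is_null() || raw.cigar_len == 0 {
        Vec::new()
    } else {
        unsafe { slice::from_raw_parts(raw.cigar, raw.cigar_len).to_vec() }
    };
    OwnedRecord { cigar }
}

/// Success path: copy first, then hand the memory back to its allocator.
fn copy_then_free(mut raw: RawRecord) -> OwnedRecord {
    // SAFETY: `raw` came from fake_alloc_record and has not been freed yet.
    let owned = unsafe { convert_record(&raw) };
    unsafe { fake_free_record(&mut raw) };
    owned
}
```

The key property is that memory is always released by the side that allocated it: Rust never drops the foreign pointer itself, and after the free call only the owned copy remains in use.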
The output buffer is zero-initialized once by C++ at the start of mapPaired (single memset). Neither _nextRecord nor the Rust caller performs additional zeroing.
YaraMapper is Send but not Sync. OpenMP parallelism is internal to each map_paired call; concurrent calls from multiple threads are not safe.
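One common way to use a Send-but-not-Sync handle from several threads is to wrap it in a Mutex, which serializes access. The sketch below demonstrates this with a hypothetical `Handle` type whose `u64` counter stands in for opaque C++ state; the method name `map_batch` and its `&mut self` signature are assumptions, not the crate's API.

```rust
use std::ptr::NonNull;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical handle wrapper; the u64 stands in for opaque C++ state.
struct Handle {
    ptr: NonNull<u64>,
}

// SAFETY (assumed for this sketch): the state is only reached through the
// owning Handle, so moving the owner to another thread is sound.
// There is deliberately no `Sync` impl: NonNull keeps the type !Sync.
unsafe impl Send for Handle {}

impl Handle {
    fn new() -> Self {
        let raw = Box::into_raw(Box::new(0u64));
        Handle { ptr: NonNull::new(raw).expect("Box::into_raw never returns null") }
    }

    /// Stand-in for a mapping call; returns the running total of reads.
    fn map_batch(&mut self, n_reads: u64) -> u64 {
        // SAFETY: `ptr` is valid until Drop and we hold exclusive access.
        unsafe {
            *self.ptr.as_ptr() += n_reads;
            *self.ptr.as_ptr()
        }
    }
}

impl Drop for Handle {
    fn drop(&mut self) {
        // SAFETY: `ptr` was created by Box::into_raw in `new`.
        unsafe { drop(Box::from_raw(self.ptr.as_ptr())) };
    }
}

/// Share one handle across worker threads; the Mutex serializes the calls.
fn run_shared(workers: u64, reads_per_batch: u64) -> u64 {
    let shared = Arc::new(Mutex::new(Handle::new()));
    let threads: Vec<_> = (0..workers)
        .map(|_| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                shared.lock().unwrap().map_batch(reads_per_batch);
            })
        })
        .collect();
    for t in threads {
        t.join().unwrap();
    }
    let total = shared.lock().unwrap().map_batch(0);
    total
}
```

Because `Mutex<T>` is Sync whenever `T` is Send, `Arc<Mutex<Handle>>` compiles, while `Arc<Handle>` could not be moved into the spawned threads. An alternative with no lock contention is one handle per worker thread, at the cost of one handle's worth of state for each.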