Official PyTorch implementation of HBANet, introduced in “HBANet: A Hybrid Boundary-Aware Attention Network for Infrared and Visible Image Fusion” (CVIU 2024).
Xubo Luo¹, Jinshuo Zhang², Liping Wang³, Dongmei Niu²
¹ Shanghai University of Finance and Economics ² Jinan University ³ Jinan Fourth Hospital
Seamlessly fusing thermal saliency and visible detail via hybrid attention.
HBANet unifies infrared (IR) and visible (VIS) imagery through a dual-branch encoder, a Hybrid Boundary-Aware Attention (HBA) module, and a lightweight decoder. The HBA module couples boundary-sensitive spatial attention with cross-domain feature exchange, enabling crisp edge preservation and faithful intensity reconstruction. Training leverages a hybrid fusion loss that balances structure fidelity, brightness consistency, and spatial smoothness.
- Shared Encoder – A single convolutional backbone extracts modality-agnostic representations while respecting low-level contrast differences.
- Hybrid Attention – BAAU injects VIS-derived boundary priors; CDAU performs bidirectional multi-head attention across IR/VIS streams.
- Physics-Aware Objective – Structure, intensity, and total variation losses jointly guide fusion quality with default weights $(1.0, 10.0, 0.5)$.
- Plug-and-Play – Minimal dependencies, fast inference, and modular design for research or production deployments.
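The physics-aware objective above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the specific structure and intensity terms (here, matching the stronger source gradient and the brighter source pixel) are assumptions, while the three-term decomposition and default weights $(1.0, 10.0, 0.5)$ come from the text.

```python
import torch
import torch.nn.functional as F

def gradient_xy(x):
    # Forward differences along width and height, padded to keep the input shape.
    gx = F.pad(x[..., :, 1:] - x[..., :, :-1], (0, 1))
    gy = F.pad(x[..., 1:, :] - x[..., :-1, :], (0, 0, 0, 1))
    return gx, gy

def hybrid_fusion_loss(fused, ir, vis, weights=(1.0, 10.0, 0.5)):
    """Sketch of a structure + intensity + total-variation fusion loss
    with the README's default weights (exact terms are assumptions)."""
    w_s, w_i, w_tv = weights
    fgx, fgy = gradient_xy(fused)
    igx, igy = gradient_xy(ir)
    vgx, vgy = gradient_xy(vis)
    # Structure: fused gradients should follow the stronger source gradient.
    l_struct = F.l1_loss(fgx.abs(), torch.maximum(igx.abs(), vgx.abs())) \
             + F.l1_loss(fgy.abs(), torch.maximum(igy.abs(), vgy.abs()))
    # Intensity: fused brightness should track the brighter source pixel.
    l_int = F.l1_loss(fused, torch.maximum(ir, vis))
    # Total variation: encourage spatial smoothness of the fused image.
    l_tv = fgx.abs().mean() + fgy.abs().mean()
    return w_s * l_struct + w_i * l_int + w_tv * l_tv
```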
| Stage | Description |
|---|---|
| Dual-Branch Encoder | A Conv–BN–ReLU stack followed by residual blocks (shared weights) produces IR/VIS feature pyramids. |
| Boundary-Aware Attention Unit (BAAU) | Generates a Sobel-based boundary prior from the VIS input to refine spatial saliency. |
| Cross-Domain Attention Unit (CDAU) | Multi-head cross-attention enables global, modality-aware fusion between the two feature streams. |
| Decoder | Residual refinement followed by pointwise projection reconstructs the fused grayscale image. |
Refer to `details.md` for an in-depth breakdown.
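To make the BAAU row concrete, here is a minimal sketch of how a Sobel-based boundary prior can be derived from the VIS input. The module name, normalization, and gating use are illustrative assumptions; only the fixed Sobel kernels and the VIS-derived prior come from the table above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SobelBoundaryPrior(nn.Module):
    """Illustrative boundary-prior extractor in the spirit of BAAU
    (class name and details are assumptions, not the repo's API)."""
    def __init__(self):
        super().__init__()
        kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
        # Fixed (non-learned) horizontal and vertical Sobel kernels.
        self.register_buffer("kx", kx.view(1, 1, 3, 3))
        self.register_buffer("ky", kx.t().contiguous().view(1, 1, 3, 3))

    def forward(self, vis):  # vis: (B, 1, H, W) grayscale visible image
        gx = F.conv2d(vis, self.kx, padding=1)
        gy = F.conv2d(vis, self.ky, padding=1)
        mag = torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)
        # Normalize to [0, 1] so the prior can gate spatial attention maps.
        return mag / mag.amax(dim=(-2, -1), keepdim=True).clamp_min(1e-6)
```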
- Python ≥ 3.9
- PyTorch ≥ 1.12 with CUDA support (optional but recommended)
- Additional dependencies listed in `requirements.txt`
```bash
git clone https://github.com/LuoXubo/HBANet.git
cd HBANet
pip install -r requirements.txt
```

- Download paired IR–VIS datasets (e.g., TNO, RoadScene, LLVIP).
- Align and resize images (default: 256×256), normalize to [0, 1].
- Organize directories as required by `data/dataloder.py` (IR and VIS folders with matching filenames).
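Since the dataloader pairs IR and VIS images by filename, a quick sanity check before training can catch unpaired files early. A minimal sketch (the helper name and extension list are assumptions):

```python
from pathlib import Path

def check_pairs(ir_dir, vis_dir, exts=(".png", ".jpg", ".bmp")):
    """Verify every IR image has a VIS counterpart with the same filename,
    matching the dataloader's expectation (helper is illustrative)."""
    ir = {p.name for p in Path(ir_dir).iterdir() if p.suffix.lower() in exts}
    vis = {p.name for p in Path(vis_dir).iterdir() if p.suffix.lower() in exts}
    missing = sorted(ir ^ vis)  # symmetric difference: files without a partner
    if missing:
        raise FileNotFoundError(f"Unpaired files: {missing}")
    return sorted(ir)
```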
Configure the training option file to enable the hybrid loss (set `G_lossfn_type: hybrid`). A minimal run is launched via:

```bash
python train.py --opt options/train_hbanet.yml
```

Key hyperparameters:
| Parameter | Default |
|---|---|
| Optimizer | Adam (lr = 1e-4, β₁ = 0.9, β₂ = 0.999) |
| Batch size | 8–16 |
| Epochs | 100–200 |
| LR schedule | Cosine decay / StepLR |
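In PyTorch, the optimizer and schedule defaults from the table translate to roughly the following (a sketch only; the actual training script reads these values from the YAML options file, and the stand-in model is a placeholder):

```python
import torch

# Stand-in module; the real script builds HBANet from the options file.
model = torch.nn.Conv2d(1, 1, 3, padding=1)

# Adam with the table's defaults: lr = 1e-4, betas = (0.9, 0.999).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, betas=(0.9, 0.999))

# One of the listed schedules: cosine decay over the epoch budget (200 here).
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=200)

for epoch in range(3):  # per-epoch training loop body elided
    optimizer.step()
    scheduler.step()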
Run inference with a trained checkpoint:
```bash
python test.py \
    --model_path /path/to/checkpoint.pth \
    --dataset_root ./Dataset/testsets \
    --dataset MSRS \
    --ir_dir IR \
    --vis_dir VI \
    --output_dir ./results
```

The script computes fused outputs and stores them under `./results/HBANet_<DATASET>`.
Recommended quantitative metrics: Entropy (EN), Mutual Information (MI), SSIM, and Qabf. Evaluation utilities can be added in utils/ or external toolkits.
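As a starting point for such utilities, the entropy (EN) metric can be computed as below; this is a standard Shannon-entropy sketch (the function name is an assumption), with higher values indicating richer information content in the fused image.

```python
import numpy as np

def image_entropy(img, bins=256):
    """Shannon entropy (EN) of a grayscale image with values in [0, 255]."""
    hist, _ = np.histogram(img, bins=bins, range=(0, 255))
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins so log2 is well defined
    return float(-(p * np.log2(p)).sum())
```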
| Dataset | EN ↑ | MI ↑ | SSIM ↑ | Qabf ↑ |
|---|---|---|---|---|
| MSRS | TBD | TBD | TBD | TBD |
Numbers will be updated once public checkpoints are released.
If our work benefits your research, please cite:
```bibtex
@article{LUO2024104161,
  title   = {HBANet: A hybrid boundary-aware attention network for infrared and visible image fusion},
  journal = {Computer Vision and Image Understanding},
  volume  = {249},
  pages   = {104161},
  year    = {2024},
  doi     = {10.1016/j.cviu.2024.104161},
  author  = {Xubo Luo and Jinshuo Zhang and Liping Wang and Dongmei Niu}
}
```

For questions or collaboration proposals, please open an issue or email xuboluo@bupt.edu.cn (replace with the appropriate contact).