MFF: A Deep Learning Model for Multi-Modal Image Fusion Based on Multiple Filters


Bibliographic Details
Main Authors: Yuequn Wang, Zhengwei Li, Jianli Wang, Leqiang Yang, Bo Dong, Hanfu Zhang, Jie Liu
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/10877823/
Description
Summary: Multi-modal image fusion refers to fusing the features of two or more different images taken from the same viewpoint to increase the amount of information contained in a single image. This study proposes a multi-modal image fusion deep network called the MFF network. Compared with traditional image fusion models, the MFF network decomposes high-frequency features more finely. In contrast to popular transformer networks, the MFF network uses multiple filter networks to extract the corresponding high- and low-frequency features, thereby reducing model training and inference time. First, the MFF network uses GaborNet filtering modules to extract high-frequency texture features and invertible neural network (INN) modules to extract high-frequency edge features; together, these two sets of features constitute the high-frequency characteristics of an image. The LEF module serves as a low-pass filter to acquire the low-frequency characteristics of an image. Training and fusion then exploit the correlation of low-frequency features and the non-correlation of high-frequency features. Systematic comparisons with other state-of-the-art image fusion models on the TNO, MSRS, and RoadScene datasets show that the MFF model achieves superior performance in visible-infrared image fusion. Furthermore, evaluations on the LLVIP dataset confirm the model's effectiveness in downstream machine vision tasks, and comparisons on the MRI_CT, MRI_PET, and MRI_SPECT datasets demonstrate that the MFF model also performs exceptionally well in medical image fusion.
ISSN: 2169-3536
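
For readers who want a concrete picture of the three-branch decomposition described in the summary above, the following is a minimal, illustrative PyTorch sketch: a fixed Gabor filter bank standing in for the GaborNet texture branch, a single additive coupling block as the basic unit of an INN edge branch, and a box-blur low-pass filter standing in for the LEF module. All module designs, kernel sizes, and hyperparameters here are assumptions chosen for illustration; the paper's actual GaborNet, INN, and LEF architectures may differ.

    # Illustrative sketch of the decomposition described in the abstract.
    # All names and hyperparameters below are assumptions, not the paper's code.
    import math
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def gabor_kernel(size, sigma, theta, lam, gamma=0.5, psi=0.0):
        """Build one real Gabor kernel (a texture-sensitive band-pass filter)."""
        half = size // 2
        ys, xs = torch.meshgrid(
            torch.arange(-half, half + 1, dtype=torch.float32),
            torch.arange(-half, half + 1, dtype=torch.float32),
            indexing="ij",
        )
        x_t = xs * math.cos(theta) + ys * math.sin(theta)
        y_t = -xs * math.sin(theta) + ys * math.cos(theta)
        return torch.exp(-(x_t ** 2 + (gamma * y_t) ** 2) / (2 * sigma ** 2)) \
            * torch.cos(2 * math.pi * x_t / lam + psi)

    class GaborBank(nn.Module):
        """Fixed Gabor filter bank standing in for the GaborNet texture branch."""
        def __init__(self, n_orient=4, size=7):
            super().__init__()
            kernels = torch.stack(
                [gabor_kernel(size, sigma=2.0, theta=i * math.pi / n_orient, lam=4.0)
                 for i in range(n_orient)]
            ).unsqueeze(1)  # shape: (n_orient, 1, size, size)
            self.register_buffer("weight", kernels)
            self.pad = size // 2

        def forward(self, x):  # x: (B, 1, H, W) grayscale image
            return F.conv2d(x, self.weight, padding=self.pad)

    class CouplingBlock(nn.Module):
        """One additive coupling step, the basic unit of an invertible network."""
        def __init__(self, ch):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(ch // 2, ch // 2, 3, padding=1), nn.ReLU(),
                nn.Conv2d(ch // 2, ch // 2, 3, padding=1),
            )

        def forward(self, x):
            x1, x2 = x.chunk(2, dim=1)
            # Invertible by construction: x2 can be recovered as y2 - net(x1).
            return torch.cat([x1, x2 + self.net(x1)], dim=1)

    def low_pass(x, k=5):
        """Box-blur low-pass filter standing in for the LEF module."""
        w = torch.ones(x.shape[1], 1, k, k, device=x.device) / (k * k)
        return F.conv2d(x, w, padding=k // 2, groups=x.shape[1])

    # Decompose a dummy image into low-frequency and two high-frequency feature sets.
    img = torch.rand(1, 1, 64, 64)
    hf_texture = GaborBank()(img)            # high-frequency texture features
    hf_edges = CouplingBlock(4)(hf_texture)  # high-frequency edge features (INN-style)
    lf = low_pass(img)                       # low-frequency features
    print(hf_texture.shape, hf_edges.shape, lf.shape)

Running the sketch on a dummy grayscale image prints the shapes of the two high-frequency feature sets and the low-frequency map; in the paper these branches are trained jointly, with fusion exploiting low-frequency correlation and high-frequency non-correlation.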