Article

Sparse Representation-Based Multi-Focus Image Fusion Method via Local Energy in Shearlet Domain

1 Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
2 College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China
* Author to whom correspondence should be addressed.
Submission received: 14 February 2023 / Revised: 1 March 2023 / Accepted: 3 March 2023 / Published: 7 March 2023
(This article belongs to the Section Sensing and Imaging)

Abstract

Multi-focus image fusion plays an important role in computer vision applications. Because blurring and information loss can occur during fusion, our goal is to obtain a high-definition, information-rich fused image. In this paper, a novel multi-focus image fusion method via local energy and sparse representation in the shearlet domain is proposed. The source images are decomposed into low- and high-frequency sub-bands by the shearlet transform; the low-frequency sub-bands are fused by sparse representation, and the high-frequency sub-bands are fused by local energy. The inverse shearlet transform is then used to reconstruct the fused image. The proposed method is verified on 20 image pairs from the Lytro dataset and compared against 8 state-of-the-art fusion methods using 8 metrics. The experimental results show that our method achieves good performance for multi-focus image fusion.

1. Introduction

Due to the limited depth of field of optical lenses, an imaging device sometimes cannot bring all objects or areas of the same scene into clear focus, so scene content outside the depth of field appears defocused and blurred [1,2,3,4,5]. To address this problem, multi-focus image fusion provides an effective way to synthesize the complementary information contained in multiple partially focused images of the same scene and to generate an all-in-focus fused image that is better suited to human observation and computer processing; it has wide application value in digital photography, microscopic imaging, holographic imaging, integrated imaging, and other fields [6,7,8,9,10,11,12,13,14,15].
Many multi-focus image fusion methods have been proposed; in particular, methods based on multi-scale transforms, sparse representation, edge-preserving filtering, and deep learning have achieved remarkable results in image fusion [16]. The curvelet [17], surfacelet [18], contourlet [19,20], and shearlet transforms [21,22,23] are widely used in the multi-scale transform field. Vishwakarma et al. [24] introduced a multi-focus image fusion algorithm via the curvelet transform and the Karhunen–Loève Transform (KLT); this method achieves fused images with less noise and improves the information interpretation capability of the fused images. Yang et al. [25] proposed a multi-focus image fusion method using a pulse-coupled neural network (PCNN) and sum-modified-Laplacian algorithms in the fast discrete curvelet transform domain. Zhang et al. [26] proposed a multi-focus image fusion technique using a compound pulse-coupled neural network in the surfacelet domain, with a local sum-modified-Laplacian algorithm used as the external stimulus of the compound PCNN, and the results show that this method achieves good performance for multi-focus image fusion. Li et al. [27] introduced multi-focus image fusion utilizing dynamic threshold neural P systems and a surfacelet transform, where the sum-modified-Laplacian algorithm and spatial frequency serve as the external inputs of the dynamic threshold neural P systems for the low- and high-frequency coefficients, respectively; consistency verification is used to obtain the final multi-focus fused image, and this method can solve the problem of artifacts. Xu et al. [28] introduced image fusion utilizing an enhanced cross-visual cortex model based on artificial selection and an impulse-coupled neural network in the nonsubsampled contourlet transform domain; this method achieves outstanding edge information, high contrast, and brightness. Das et al. [29] introduced a fuzzy-adaptive reduced pulse-coupled neural network for image fusion in the nonsubsampled contourlet transform domain, and this method can generate a fused image with higher contrast than other state-of-the-art image fusion algorithms. Li et al. [30] introduced a multi-focus image fusion framework using multi-scale transform decomposition, where the nonsubsampled contourlet transform is used to obtain the basic fused image, and the energy of gradient of the difference images is used to refine the basic fused image by integrating the average filter and median filter; this method can generate a high-definition fused image. Peng et al. [31] proposed coupled neural P systems and a nonsubsampled contourlet transform for image fusion, and the quantitative and qualitative experimental results demonstrate the advantages of the fusion approach. Wang et al. [32] introduced a complex shearlet features-motivated generative adversarial network for multi-focus image fusion. Li et al. [33] proposed a multi-focus image fusion method via a spatial frequency-motivated parameter-adaptive pulse-coupled neural network and an improved sum-modified-Laplacian in the nonsubsampled shearlet transform domain, and visual inspection and objective evaluation verified the effectiveness of the fusion method. Amrita et al. [34] proposed an image fusion method using a water wave optimization (WWO) algorithm in the nonsubsampled shearlet transform domain, and this method can obtain good fusion results. Luo et al. [35] introduced multi-modal image fusion using a 3-D shearlet transform and T-S fuzzy reasoning, and this method can achieve good fusion results. Yin et al. [36] proposed a parameter-adaptive pulse-coupled neural network (PAPCNN)-based multi-modal image fusion method in the nonsubsampled shearlet transform domain, where weighted local energy and a weighted sum of eight-neighborhood-based modified Laplacian are used for fusing the low-frequency components, and the PAPCNN-based fusion model is used for fusing the high-frequency components; the results achieve state-of-the-art performance according to visual perception and objective assessments.
Sparse representation-based methods have been widely used in image restoration and image fusion [37,38,39,40,41,42,43,44,45,46]. Wang et al. [47] proposed a joint patch clustering-based adaptive dictionary and sparse representation for multi-modality image fusion, where a Gaussian filter is used to separate the low- and high-frequency components, a local energy-weighted strategy is used to fuse the low-frequency sub-bands, an over-complete adaptive learning dictionary is constructed by the joint patch clustering model, and a hybrid fusion rule depending on the similarity of the multi-norm of sparse representation coefficients is introduced to fuse the high-frequency sub-bands; this method has good robustness and wide applicability. Qin et al. [48] proposed an improved image fusion algorithm using the discrete wavelet transform and sparse representation, and this method can achieve higher contrast and more image details. Liu et al. [49] introduced an effective image fusion approach using convolutional sparse representation, and this method outperforms other image fusion algorithms in terms of visual and objective assessments. Liu et al. [50] introduced an adaptive sparse representation model for multi-focus image fusion and denoising, and this approach generates good performance according to visual quality and objective assessment.
Edge-preserving filtering has been widely used in image enhancement, image smoothing, image denoising, and image fusion, and its effect is especially significant in the field of image fusion. Li et al. [51] introduced guided image filtering for image fusion: the base layer and detail layer are generated by guided-filtering decomposition, and a weighted average model is used as the fusion rule. This method was tested on multi-spectral, multi-focus, multi-modal, and multi-exposure images and obtains fast and effective fusion results. Zhang et al. [52] introduced local extreme map guided image filtering for image fusion, covering medical images, multi-focus images, infrared and visible images, and multi-exposure images, and this method can generate good performance.
Deep learning-based image fusion methods have been widely used in image processing. Zhang et al. [53] proposed an image fusion method using a convolutional neural network (IFCNN), and this method performs well for multi-focus, infrared-visible, multi-modal medical, and multi-exposure image fusion. Zhang et al. [54] introduced a fast unified image fusion network based on the proportional maintenance of gradient and intensity, and this method can generate good fusion results. Xu et al. [55] proposed the unified and unsupervised end-to-end image fusion network (U2Fusion), and this algorithm achieves better fusion effects than state-of-the-art fusion methods. Dong et al. [56] proposed a multi-branch multi-scale deep learning image fusion algorithm based on DenseNet, and this method achieves excellent results and keeps more feature information of the source images in the fused image.
In order to generate a high-quality multi-focus fusion image, a novel image fusion framework based on sparse representation and local energy is proposed. The source images are separated into the low- and high-frequency sub-bands by shearlet transform, then the sparse representation model is used for fusing the low-frequency sub-bands, and the local energy-based fusion rule is used for fusing the high-frequency sub-bands. The inverse shearlet transform is applied to reconstruct the fused image. Experimental results show that the proposed multi-focus image fusion method can retain more source image information.

2. Related Works

2.1. Shearlet Transform

In dimension $n = 2$, the shearlet transform (ST) of a signal $f$ is defined as follows [21]:
$$SH_{\psi}(f) = \langle f, \psi_{a,s,t} \rangle$$
where $SH_{\psi}(\cdot)$ denotes the shearlet transform and $\langle \cdot , \cdot \rangle$ denotes the inner product. The ST projects $f$ onto the functions $\psi_{a,s,t}$ at scale $a$, orientation $s$, and location $t$.
The element $\psi_{a,s,t}$ is called a shearlet, and it is generated by:
$$\psi_{a,s,t}(x) = |\det M_{a,s}|^{-\frac{1}{2}}\, \psi\!\left(M_{a,s}^{-1} x - t\right), \qquad a \in \mathbb{R}^{+},\; s \in \mathbb{R},\; t \in \mathbb{R}^{2}$$
where $\mathbb{R}^{+}$, $\mathbb{R}$, and $\mathbb{R}^{2}$ denote the positive real numbers, the real numbers, and 2-dimensional real vectors, respectively. $M_{a,s}$ is given by:
$$M_{a,s} = \begin{pmatrix} a & \sqrt{a}\, s \\ 0 & \sqrt{a} \end{pmatrix}$$
where $M_{a,s} = S_{s} A_{a}$ is the product of two matrices: the shear matrix $S_{s}$ and the anisotropic dilation matrix $A_{a}$, given by:
$$S_{s} = \begin{pmatrix} 1 & s \\ 0 & 1 \end{pmatrix}, \qquad A_{a} = \begin{pmatrix} a & 0 \\ 0 & \sqrt{a} \end{pmatrix}$$
The inverse shearlet transform is given by:
$$f = \int_{\mathbb{R}^{2}} \int_{-\infty}^{+\infty} \int_{0}^{+\infty} \langle f, \psi_{a,s,t} \rangle\, \psi_{a,s,t}\, \frac{da}{a^{3}}\, ds\, dt$$
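As a quick numerical check (not part of the original paper), the following NumPy snippet builds $S_s$ and $A_a$ and verifies that their product reproduces $M_{a,s}$ for sample parameter values:

```python
import numpy as np

def shear_matrix(s: float) -> np.ndarray:
    """Shear matrix S_s."""
    return np.array([[1.0, s],
                     [0.0, 1.0]])

def dilation_matrix(a: float) -> np.ndarray:
    """Anisotropic dilation matrix A_a."""
    return np.array([[a, 0.0],
                     [0.0, np.sqrt(a)]])

def composite_matrix(a: float, s: float) -> np.ndarray:
    """M_{a,s} = S_s A_a, the matrix that generates the shearlet psi_{a,s,t}."""
    return shear_matrix(s) @ dilation_matrix(a)

a, s = 0.25, 0.5
M = composite_matrix(a, s)
expected = np.array([[a, np.sqrt(a) * s],
                     [0.0, np.sqrt(a)]])
assert np.allclose(M, expected)   # M_{a,s} = [[a, sqrt(a)*s], [0, sqrt(a)]]
```

The factorization makes explicit that shearing controls orientation while the anisotropic dilation controls scale.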

2.2. Sparse Representation

Sparse representation can effectively extract the essential characteristics of a signal, which can be represented by a linear combination of a few non-zero atoms from a dictionary [57]. We define the signal $x \in \mathbb{R}^{n}$ and the over-complete dictionary $D \in \mathbb{R}^{n \times m}$ ($n < m$). The purpose of sparse representation is to estimate the sparse vector $\alpha \in \mathbb{R}^{m}$ with the fewest nonzero entries such that $x \approx D\alpha$. Suppose that $M$ training patches of size $\sqrt{n} \times \sqrt{n}$ are rearranged into column vectors in $\mathbb{R}^{n}$, so the training database $\{y_i\}_{i=1}^{M}$ is constructed with each $y_i \in \mathbb{R}^{n}$. The dictionary learning model can be written as follows:
$$\min_{D,\; \{\alpha_i\}_{i=1}^{M}} \sum_{i=1}^{M} \|\alpha_i\|_{0} \qquad \text{s.t.}\; \|y_i - D\alpha_i\|_{2} < \varepsilon,\quad \forall i \in \{1, \ldots, M\}$$
where $\varepsilon > 0$ is an error tolerance, $\{\alpha_i\}_{i=1}^{M}$ are the unknown sparse vectors corresponding to $\{y_i\}_{i=1}^{M}$, and $D \in \mathbb{R}^{n \times m}$ is the unknown dictionary to be learned. Effective algorithms, such as MOD and K-SVD, have been introduced to solve this problem; more details can be found in reference [57].
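To make the alternation behind such dictionary-learning schemes concrete, the sketch below pairs a greedy OMP sparse-coding step with a MOD-style least-squares dictionary update (the paper itself uses K-SVD, which updates atoms one at a time); all function names are illustrative and not taken from the authors' code:

```python
import numpy as np

def omp(D, y, eps):
    """Greedy orthogonal matching pursuit: add atoms until ||y - D @ alpha||_2 < eps."""
    n, m = D.shape
    alpha = np.zeros(m)
    residual = y.copy()
    support = []
    while np.linalg.norm(residual) >= eps and len(support) < n:
        k = int(np.argmax(np.abs(D.T @ residual)))   # atom most correlated with residual
        if k in support:                              # numerical stall: stop early
            break
        support.append(k)
        coefs, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        alpha[:] = 0.0
        alpha[support] = coefs
        residual = y - D @ alpha
    return alpha

def learn_dictionary(Y, m=256, iters=30, eps=0.1, seed=0):
    """Alternate OMP sparse coding with a MOD-style least-squares dictionary update."""
    rng = np.random.default_rng(seed)
    n, M = Y.shape                                    # Y: one training patch per column
    D = rng.standard_normal((n, m))
    D /= np.linalg.norm(D, axis=0, keepdims=True)     # unit-norm atoms
    for _ in range(iters):
        A = np.column_stack([omp(D, Y[:, i], eps) for i in range(M)])
        D = Y @ np.linalg.pinv(A)                     # MOD update: D = Y A^+
        D /= np.linalg.norm(D, axis=0, keepdims=True) + 1e-12
    return D
```

With 6 × 6 patches ($n = 36$) and $m = 256$ atoms, this matches the dictionary dimensions reported in Section 4, although the paper trains $D$ with K-SVD rather than MOD.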

3. Proposed Fusion Method

The proposed image fusion algorithm mainly contains four phases: shearlet transform decomposition, low-frequency fusion, high-frequency fusion, and shearlet transform reconstruction. The schematic diagram of the proposed approach is shown in Figure 1.

3.1. Shearlet Transform Decomposition

The shearlet transform is applied to the two source images $\{I_A, I_B\}$ to obtain the low-frequency components $\{L_A, L_B\}$ and the high-frequency components $\{H_A, H_B\}$.

3.2. Low-Frequency Fusion

The main energy of the image is concentrated in the low-frequency component, which carries the overall structure of the image. In this section, $L_A$ and $L_B$ are merged with the sparse representation fusion method. The sliding window technique is used to divide $L_A$ and $L_B$ into image patches of size $\sqrt{n} \times \sqrt{n}$, from upper left to lower right, with a step length of $s$ pixels. Assume that there are $T$ patches, denoted $\{p_A^i\}_{i=1}^{T}$ and $\{p_B^i\}_{i=1}^{T}$, in $L_A$ and $L_B$, respectively.
For each position $i$, rearrange $\{p_A^i, p_B^i\}$ into column vectors $\{V_A^i, V_B^i\}$ and then normalize each vector's mean value to zero to obtain $\{\hat{V}_A^i, \hat{V}_B^i\}$ via the following equations [57]:
$$\hat{V}_A^i = V_A^i - \bar{v}_A^i \cdot \mathbf{1}$$
$$\hat{V}_B^i = V_B^i - \bar{v}_B^i \cdot \mathbf{1}$$
where $\mathbf{1}$ denotes an all-one $n \times 1$ vector, and $\bar{v}_A^i$ and $\bar{v}_B^i$ are the mean values of all the elements in $V_A^i$ and $V_B^i$, respectively.
The sparse coefficient vectors $\{\alpha_A^i, \alpha_B^i\}$ of $\{\hat{V}_A^i, \hat{V}_B^i\}$ are computed with the orthogonal matching pursuit (OMP) technique using the following formulas:
$$\alpha_A^i = \arg\min_{\alpha} \|\alpha\|_{0} \qquad \text{s.t.}\; \|\hat{V}_A^i - D\alpha\|_{2} < \varepsilon$$
$$\alpha_B^i = \arg\min_{\alpha} \|\alpha\|_{0} \qquad \text{s.t.}\; \|\hat{V}_B^i - D\alpha\|_{2} < \varepsilon$$
where $D$ denotes the dictionary learned with the K-singular value decomposition (K-SVD) method.
Then, $\alpha_A^i$ and $\alpha_B^i$ are merged with the "max-L1" rule to obtain the fused sparse vector:
$$\alpha_F^i = \begin{cases} \alpha_A^i & \text{if } \|\alpha_A^i\|_{1} > \|\alpha_B^i\|_{1} \\ \alpha_B^i & \text{otherwise} \end{cases}$$
The fused result of $V_A^i$ and $V_B^i$ is then computed by:
$$V_F^i = D\alpha_F^i + \bar{v}_F^i \cdot \mathbf{1}$$
where the merged mean value $\bar{v}_F^i$ is given by:
$$\bar{v}_F^i = \begin{cases} \bar{v}_A^i & \text{if } \alpha_F^i = \alpha_A^i \\ \bar{v}_B^i & \text{otherwise} \end{cases}$$
The above process is iterated over all the source image patches in $\{p_A^i\}_{i=1}^{T}$ and $\{p_B^i\}_{i=1}^{T}$ to obtain all the fused vectors $\{V_F^i\}_{i=1}^{T}$. Let $L_F$ denote the low-pass fused result. Each $V_F^i$ is reshaped into a patch $p_F^i$ and plugged into its original position in $L_F$. As the patches overlap, each pixel value in $L_F$ is averaged over the number of times it is accumulated.
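A simplified sketch of this sliding-window sparse fusion is given below. It uses scikit-learn's orthogonal_mp for the OMP step and assumes a pre-trained dictionary D with unit-norm atoms; it illustrates the max-L1 rule and the overlap averaging, not the authors' reference implementation:

```python
import numpy as np
from sklearn.linear_model import orthogonal_mp

def fuse_lowpass(LA, LB, D, patch=6, eps=0.1):
    """Fuse two low-frequency sub-bands with the sliding-window max-L1 sparse rule."""
    # D: (patch*patch, n_atoms) dictionary with unit-norm columns (e.g., K-SVD trained).
    H, W = LA.shape
    fused = np.zeros((H, W))
    weight = np.zeros((H, W))
    for i in range(H - patch + 1):              # step length of 1 pixel
        for j in range(W - patch + 1):
            vA = LA[i:i + patch, j:j + patch].reshape(-1)
            vB = LB[i:i + patch, j:j + patch].reshape(-1)
            mA, mB = vA.mean(), vB.mean()
            # Sparse-code the zero-mean vectors; tol is the squared residual norm.
            aA = orthogonal_mp(D, vA - mA, tol=eps ** 2)
            aB = orthogonal_mp(D, vB - mB, tol=eps ** 2)
            # "max-L1" rule: keep the sparse vector with the larger l1-norm.
            if np.abs(aA).sum() > np.abs(aB).sum():
                aF, mF = aA, mA
            else:
                aF, mF = aB, mB
            vF = D @ aF + mF                    # add back the selected mean value
            fused[i:i + patch, j:j + patch] += vF.reshape(patch, patch)
            weight[i:i + patch, j:j + patch] += 1.0
    return fused / np.maximum(weight, 1e-12)    # average over overlapping patches
```

Patches are vectorized in row-major order here; any consistent ordering works as long as the dictionary was trained with the same one.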

3.3. High-Frequency Fusion

The high-frequency components contain a great deal of detailed information and are fused using the local coefficient energy, defined as follows [58]:
$$E(\omega) = \frac{1}{M \times N} \sum_{m=1}^{M} \sum_{n=1}^{N} H(m, n)^{2}$$
where $H(m, n)$ represents the high-frequency coefficient at pixel $(m, n)$, and $\omega$ is a local window of size $M \times N$. Let $\omega_A(i, j)$ and $\omega_B(i, j)$ denote the local windows centered at pixel $(i, j)$ in $H_A$ and $H_B$, respectively. The high-frequency fused result $H_F$ is obtained by:
$$H_F(i, j) = \begin{cases} H_A(i, j) & \text{if } E(\omega_A(i, j)) > E(\omega_B(i, j)) \\ H_B(i, j) & \text{otherwise} \end{cases}$$
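A compact NumPy/SciPy sketch of this selection rule follows; the 3 × 3 window size is an illustrative assumption, since the section does not fix $M \times N$:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def fuse_highpass(HA, HB, win=3):
    """Pick, pixel by pixel, the high-frequency coefficient with the larger local energy."""
    # Local energy = mean of squared coefficients over a win x win neighborhood.
    EA = uniform_filter(HA ** 2, size=win, mode="reflect")
    EB = uniform_filter(HB ** 2, size=win, mode="reflect")
    return np.where(EA > EB, HA, HB)
```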

3.4. Shearlet Transform Reconstruction

The inverse shearlet transform is performed on $L_F$ and $H_F$ to reconstruct the final fused image $I_F$.
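Tying the four phases together, the pipeline can be sketched as follows; shearlet_decompose and shearlet_reconstruct are placeholders for whichever shearlet implementation is available (the paper does not name one), and fuse_lowpass / fuse_highpass are the sketches from Sections 3.2 and 3.3:

```python
def fuse_multifocus(IA, IB, D, levels=4):
    """Sketch of the full pipeline: decompose, fuse each band, reconstruct."""
    # 1. Shearlet decomposition into a low-frequency band and a set of
    #    high-frequency directional bands (placeholder API).
    LA, HA_bands = shearlet_decompose(IA, levels=levels)
    LB, HB_bands = shearlet_decompose(IB, levels=levels)

    # 2. Low-frequency fusion with the sparse-representation max-L1 rule.
    LF = fuse_lowpass(LA, LB, D)

    # 3. High-frequency fusion with the local-energy rule, band by band.
    HF_bands = [fuse_highpass(HA, HB) for HA, HB in zip(HA_bands, HB_bands)]

    # 4. Inverse shearlet transform yields the fused image.
    return shearlet_reconstruct(LF, HF_bands)
```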

4. Experimental Results and Discussions

In this section, 20 pairs of multi-focus images from the Lytro dataset [59] (Figure 2) are used to evaluate the proposed multi-focus image fusion algorithm with both subjective and objective evaluation metrics. Comparison with recently published algorithms highlights the advantages of our image fusion algorithm. Eight state-of-the-art image fusion methods are selected for comparison: the nonsubsampled contourlet transform with a fuzzy-adaptive reduced pulse-coupled neural network (NSCT) [29], image fusion using the curvelet transform (CVT) [57], image fusion with a parameter-adaptive pulse-coupled neural network in the nonsubsampled shearlet transform domain (NSST) [36], the image fusion framework based on a convolutional neural network (IFCNN) [53], the fast unified image fusion network based on the proportional maintenance of gradient and intensity (PMGI) [54], the unified unsupervised image fusion network (U2Fusion) [55], local extreme map guided multi-modal image fusion (LEGFF) [52], and zero-shot multi-focus image fusion (ZMFF) [60]. A single evaluation index cannot fully reflect image quality, so multiple indexes are used together to analyze the data and image information more objectively. Eight metrics are used for objective evaluation: the edge-based similarity measurement $Q^{AB/F}$ [61], the human perception inspired metric $Q_{CB}$ [62], the structural similarity-based metric $Q_Y$ introduced by Yang et al. [62], the structural similarity-based metric $Q_E$ [62], the gradient-based metric $Q_G$ [62], the nonlinear correlation information entropy $Q_{NCIE}$ [62], the mutual information $Q_{MI}$ [61], and the phase congruency-based metric $Q_P$ [62]. Figure 3, Figure 4, Figure 5, Figure 6 and Figure 7 show the corresponding fusion results, and Figure 8 and Table 1, Table 2, Table 3, Table 4 and Table 5 show the corresponding metrics data. In our method, the number of shearlet decomposition levels is 4, and the direction numbers are [10, 10, 18, 18]. The dictionary size is set to 256, the number of K-SVD iterations is fixed at 180, the patch size is 6 × 6, the step length is set to 1, and the error tolerance $\varepsilon$ is set to 0.1.
Figure 3 shows the fused images produced by the different methods on the first pair of images in Figure 2, and Table 1 shows the corresponding metrics data. The fused images generated by the NSCT, CVT, and NSST algorithms are blurred in some areas. The PMGI method generates a dark image that is distorted and blurred. The IFCNN, U2Fusion, LEGFF, and ZMFF methods generate higher brightness. Compared with the other fusion methods, our method gives the best fusion result, and more complementary image information is retained. The enlarged area in the images allows some details of the fused images to be observed. From Table 1, we can see that the values of $Q^{AB/F}$, $Q_Y$, $Q_E$, $Q_G$, $Q_{NCIE}$, $Q_{MI}$, and $Q_P$ generated by our method are the best, at 0.7446, 0.9708, 0.8868, 0.7273, 0.8243, 6.5008, and 0.7860, respectively. The ZMFF method generates the best value of $Q_{CB}$ with 0.7802, and our method, which achieves 0.7760, ranks second.
Figure 4 shows the fused images produced by the different methods on the second pair of images in Figure 2, and Table 2 depicts the corresponding metrics data. The fused images generated by the NSCT, CVT, IFCNN, LEGFF, and ZMFF algorithms produce a considerable fusion effect, and the images are similar. The NSST algorithm produces clearer close-range information, while distant information, such as the outline of the mountain, is relatively fuzzy. The PMGI algorithm produces a fuzzy fusion image that does not achieve information complementarity; its definition is obviously low, so it is difficult to observe details in the image. The U2Fusion method improves the brightness of some areas of the image, such as the man's face, but the head, mouth, and neck areas of the man are obviously dark, so this information cannot be observed. Compared with the other fusion algorithms, our algorithm obtains clear close-range and distant information, achieves information complementarity, and maintains the image details well, so the result is easy to observe. From Table 2, we can see that the values of $Q_{CB}$, $Q_Y$, and $Q_E$ computed by our method are the best, at 0.6924, 0.9593, and 0.8684, respectively.
Figure 5 shows the fused images produced by the different methods on the third pair of images in Figure 2, and Table 3 shows the corresponding metrics data. The fused images generated by the NSCT and NSST algorithms are blurred in the girl's face area. The CVT, IFCNN, LEGFF, and ZMFF methods generate all-in-focus images. The PMGI approach generates a distorted and blurred fusion image, making it impossible to obtain details from the image. Some areas in the fused image acquired by the U2Fusion method are very dark, such as the collars of the boy and girl, the boy's tongue and hair, and the leaves. Our algorithm obtains a full-focus image, and the details of the source images are well preserved. From Table 3, we can see that the values of $Q^{AB/F}$, $Q_Y$, $Q_E$, $Q_G$, and $Q_P$ generated by our method are the best, at 0.7134, 0.9589, 0.8710, 0.7139, and 0.8194, respectively.
Figure 6 shows the fused images produced by the different methods on the fourth pair of images in Figure 2, and Table 4 shows the corresponding metrics data. The fused images generated by the NSCT, CVT, IFCNN, LEGFF, and ZMFF algorithms are essentially full-focus images. The NSST method produces blurring in places, such as the contour of the woman in the distance. The PMGI method produces a completely blurred and dark result. The U2Fusion method makes some areas too bright and others too dark, without achieving moderate brightness. Our method produces a clear full-focus image, and information complementation achieves an optimal effect. From Table 4, we can see that the values of $Q^{AB/F}$, $Q_{CB}$, $Q_Y$, $Q_E$, $Q_G$, and $Q_P$ generated by our method are the best, at 0.7148, 0.7301, 0.9584, 0.8691, 0.7162, and 0.8249, respectively.
Figure 7 shows the fused results of the different methods on the other images in Figure 2, allowing the fusion effects of the different algorithms to be compared. Figure 8 shows the line charts of the metrics data for the different methods in Figure 2, from which we can observe the fluctuation of the corresponding index values obtained by the different algorithms on the 20 groups of multi-focus images. The average metrics data of the different methods in Figure 8 are shown in Table 5; from this table, we can see that the values of $Q^{AB/F}$, $Q_{CB}$, $Q_Y$, $Q_E$, $Q_G$, and $Q_{NCIE}$ generated by the proposed method are the best. The values of $Q_{MI}$ and $Q_P$ generated by the IFCNN method are the best; however, the corresponding values obtained by our algorithm still rank second among all the algorithms and retain an obvious advantage over the remaining methods. Through qualitative and quantitative evaluation and analysis, our algorithm achieves the best multi-focus image fusion effect.

5. Conclusions

In order to generate a clear full-focus image, a novel multi-focus image fusion method based on sparse representation and local energy in the shearlet domain is introduced. The shearlet transform is utilized to decompose the source images into low- and high-frequency sub-bands; a sparse representation-based fusion rule is used to fuse the low-frequency sub-bands, and a local energy-based fusion rule is used to fuse the high-frequency sub-bands. Twenty groups of multi-focus images are tested, and the effectiveness of the proposed algorithm is verified through qualitative and quantitative evaluation and analysis. The average values of $Q^{AB/F}$, $Q_{CB}$, $Q_Y$, $Q_E$, $Q_G$, and $Q_{NCIE}$ computed by our method are the best, at 0.7343, 0.7436, 0.9538, 0.8808, 0.7317, and 0.8299, respectively; the values of $Q_{MI}$ and $Q_P$ are also highly competitive. In future work, we will extend this algorithm to multi-exposure image fusion and other multi-modal image fusion tasks.

Author Contributions

The experimental measurements and data collection were carried out by L.L. and H.M. The manuscript was written by L.L. with the assistance of M.L., Z.J. and H.M. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Cross-Media Intelligent Technology Project of Beijing National Research Center for Information Science and Technology (BNRist) under Grant No. BNR2019TD01022.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Vasu, G.T.; Palanisamy, P. Gradient-based multi-focus image fusion using foreground and background pattern recognition with weighted anisotropic diffusion filter. Signal Image Video Process. 2023. [Google Scholar] [CrossRef]
  2. Li, H.; Qian, W. Siamese conditional generative adversarial network for multi-focus image fusion. Appl. Intell. 2023. [Google Scholar] [CrossRef]
  3. Li, X.; Wang, X. Multi-focus image fusion based on Hessian matrix decomposition and salient difference focus detection. Entropy 2022, 24, 1527. [Google Scholar] [CrossRef] [PubMed]
  4. Jiang, L.; Fan, H. Multi-level receptive field feature reuse for multi-focus image fusion. Mach. Vis. Appl. 2022, 33, 92. [Google Scholar] [CrossRef]
  5. Mohan, C.; Chouhan, K. Improved procedure for multi-focus images using image fusion with qshiftN DTCWT and MPCA in Laplacian pyramid domain. Appl. Sci. 2022, 12, 9495. [Google Scholar] [CrossRef]
  6. Zhang, X.; He, H.; Zhang, J. Multi-focus image fusion based on fractional order differentiation and closed image matting. ISA Trans. 2022, 129, 703–714. [Google Scholar] [CrossRef]
  7. Zhang, X. Deep learning-based multi-focus image fusion: A survey and a comparative study. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 44, 4819–4838. [Google Scholar] [CrossRef]
  8. Liu, Y.; Wang, L. Multi-focus image fusion with deep residual learning and focus property detection. Inf. Fusion 2022, 86–87, 1–16. [Google Scholar] [CrossRef]
  9. Wang, Z.; Li, X. A self-supervised residual feature learning model for multifocus image fusion. IEEE Trans. Image Process. 2022, 31, 4527–4542. [Google Scholar] [CrossRef] [PubMed]
  10. Aymaz, S.; Kose, C.; Aymaz, S. A novel approach with the dynamic decision mechanism (DDM) in multi-focus image fusion. Multimed. Tools Appl. 2023, 82, 1821–1871. [Google Scholar] [CrossRef]
  11. Luo, H.; U, K.; Zhao, W. Multi-focus image fusion through pixel-wise voting and morphology. Multimed. Tools Appl. 2023, 82, 899–925. [Google Scholar] [CrossRef]
  12. Jiang, L.; Fan, H.; Li, J. DDFN: A depth-differential fusion network for multi-focus image. Multimed. Tools Appl. 2022, 81, 43013–43036. [Google Scholar] [CrossRef]
  13. Li, L.; Ma, H. Pulse coupled neural network-based multimodal medical image fusion via guided filtering and WSEML in NSCT domain. Entropy 2021, 23, 591. [Google Scholar] [CrossRef] [PubMed]
  14. Li, L.; Ma, H. Saliency-guided nonsubsampled shearlet transform for multisource remote sensing image fusion. Sensors 2021, 21, 1756. [Google Scholar] [CrossRef]
  15. Xiao, Y.; Guo, Z.; Veelaert, P.; Philips, W. General image fusion for an arbitrary number of inputs using convolutional neural networks. Sensors 2022, 22, 2457. [Google Scholar] [CrossRef]
  16. Karim, S.; Tong, G. Current advances and future perspectives of image fusion: A comprehensive review. Inf. Fusion 2023, 90, 185–217. [Google Scholar] [CrossRef]
  17. Candes, E.; Demanet, L. Fast discrete curvelet transforms. Multiscale Model. Simul. 2006, 5, 861–899. [Google Scholar] [CrossRef] [Green Version]
  18. Lu, Y.M.; Do, M.N. Multidimensional directional filter banks and surfacelets. IEEE Trans. Image Process. 2007, 16, 918–931. [Google Scholar] [CrossRef]
  19. Do, M.N.; Vetterli, M. The contourlet transform: An efficient directional multiresolution image representation. IEEE Trans. Image Process. 2005, 14, 2091–2106. [Google Scholar] [CrossRef] [Green Version]
  20. da Cunha, A.L.; Zhou, J.; Do, M.N. The nonsubsampled contourlet transform: Theory, design, and applications. IEEE Trans. Image Process. 2006, 15, 3089–3101. [Google Scholar]
  21. Guo, K.; Labate, D. Optimally sparse multidimensional representation using shearlets. SIAM J. Math. Anal. 2007, 39, 298–318. [Google Scholar] [CrossRef] [Green Version]
  22. Easley, G.; Labate, D.; Lim, W.Q. Sparse directional image representations using the discrete shearlet transform. Appl. Comput. Harmon. Anal. 2008, 25, 25–46. [Google Scholar] [CrossRef] [Green Version]
  23. Vishwakarma, A.; Bhuyan, M.K. Image fusion using adjustable non-subsampled shearlet transform. IEEE Trans. Instrum. Meas. 2019, 68, 3367–3378. [Google Scholar] [CrossRef]
  24. Vishwakarma, A.; Bhuyan, M. A curvelet-based multi-sensor image denoising for KLT-based image fusion. Multimed. Tools Appl. 2022, 81, 4991–5016. [Google Scholar] [CrossRef]
  25. Yang, Y.; Tong, S. A hybrid method for multi-focus image fusion based on fast discrete curvelet transform. IEEE Access 2017, 5, 14898–14913. [Google Scholar] [CrossRef]
  26. Zhang, B.; Zhang, C.; Liu, Y. Multi-focus image fusion algorithm based on compound PCNN in Surfacelet domain. Optik 2014, 125, 296–300. [Google Scholar] [CrossRef]
  27. Li, B.; Peng, H. Multi-focus image fusion based on dynamic threshold neural P systems and surfacelet transform. Knowl.-Based Syst. 2020, 196, 105794. [Google Scholar] [CrossRef]
  28. Xu, W.; Fu, Y. Medical image fusion using enhanced cross-visual cortex model based on artificial selection and impulse-coupled neural network. Comput. Methods Programs Biomed. 2023, 229, 107304. [Google Scholar] [CrossRef] [PubMed]
  29. Das, S.; Kundu, M.K. A neuro-fuzzy approach for medical image fusion. IEEE Trans. Biomed. Eng. 2013, 60, 3347–3353. [Google Scholar] [CrossRef]
  30. Li, L.; Ma, H.; Jia, Z.; Si, Y. A novel multiscale transform decomposition based multi-focus image fusion framework. Multimed. Tools Appl. 2021, 80, 12389–12409. [Google Scholar] [CrossRef]
  31. Peng, H.; Li, B. Multi-focus image fusion approach based on CNP systems in NSCT domain. Comput. Vis. Image Underst. 2021, 210, 103228. [Google Scholar] [CrossRef]
  32. Wang, L.; Liu, Z. The fusion of multi-focus images based on the complex shearlet features-motivated generative adversarial network. J. Adv. Transp. 2021, 2021, 5439935. [Google Scholar] [CrossRef]
  33. Li, L.; Si, Y.; Wang, L.; Jia, Z.; Ma, H. A novel approach for multi-focus image fusion based on SF-PAPCNN and ISML in NSST domain. Multimed. Tools Appl. 2020, 79, 24303–24328. [Google Scholar] [CrossRef]
  34. Amrita, S.; Joshi, S. Water wave optimized nonsubsampled shearlet transformation technique for multimodal medical image fusion. Concurr. Comput. Pract. Exp. 2023, 35, e7591. [Google Scholar] [CrossRef]
  35. Luo, X.; Xi, X. Multimodal medical volumetric image fusion using 3-D shearlet transform and T-S fuzzy reasoning. Multimed. Tools Appl. 2022, 1–36. [Google Scholar] [CrossRef]
  36. Yin, M.; Liu, X.; Liu, Y. Medical image fusion with parameter-adaptive pulse coupled neural network in nonsubsampled shearlet transform domain. IEEE Trans. Instrum. Meas. 2019, 68, 49–64. [Google Scholar] [CrossRef]
  37. Zha, Z.; Wen, B. Learning nonlocal sparse and low-rank models for image compressive sensing: Nonlocal sparse and low-rank modeling. IEEE Signal Process. Mag. 2023, 40, 32–44. [Google Scholar] [CrossRef]
  38. Zha, Z.; Yuan, X. From rank estimation to rank approximation: Rank residual constraint for image restoration. IEEE Trans. Image Process. 2020, 29, 3254–3269. [Google Scholar] [CrossRef] [Green Version]
  39. Zha, Z.; Yuan, X. Image restoration via simultaneous nonlocal self-similarity priors. IEEE Trans. Image Process. 2020, 29, 8561–8576. [Google Scholar] [CrossRef]
  40. Zha, Z.; Yuan, X. Image restoration using joint patch-group-based sparse representation. IEEE Trans. Image Process. 2020, 29, 7735–7750. [Google Scholar] [CrossRef]
  41. Zha, Z.; Yuan, X. A benchmark for sparse coding: When group sparsity meets rank minimization. IEEE Trans. Image Process. 2020, 29, 5094–5109. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  42. Zha, Z.; Yuan, X. Group sparsity residual constraint with non-local priors for image restoration. IEEE Trans. Image Process. 2020, 29, 8960–8975. [Google Scholar] [CrossRef] [PubMed]
  43. Zha, Z.; Wen, B. Image restoration via reconciliation of group sparsity and low-rank models. IEEE Trans. Image Process. 2021, 30, 5223–5238. [Google Scholar] [CrossRef]
  44. Zha, Z.; Wen, B. A hybrid structural sparsification error model for image restoration. IEEE Trans. Neural Netw. Learn. Syst. 2022, 33, 4451–4465. [Google Scholar] [CrossRef] [PubMed]
  45. Zha, Z.; Wen, B. Triply complementary priors for image restoration. IEEE Trans. Image Process. 2021, 30, 5819–5834. [Google Scholar] [CrossRef]
  46. Zha, Z.; Wen, B. Low-rankness guided group sparse representation for image restoration. IEEE Trans. Neural Netw. Learn. Syst. 2022, 1–15. [Google Scholar] [CrossRef]
  47. Wang, C.; Wu, Y. Joint patch clustering-based adaptive dictionary and sparse representation for multi-modality image fusion. Mach. Vis. Appl. 2022, 33, 69. [Google Scholar] [CrossRef]
  48. Qin, X.; Ban, Y.; Wu, P. Improved image fusion method based on sparse decomposition. Electronics 2022, 11, 2321. [Google Scholar] [CrossRef]
  49. Liu, Y.; Chen, X. Image fusion with convolutional sparse representation. IEEE Signal Process. Lett. 2016, 23, 1882–1886. [Google Scholar] [CrossRef]
  50. Liu, Y.; Wang, Z. Simultaneous image fusion and denoising with adaptive sparse representation. IET Image Process. 2015, 9, 347–357. [Google Scholar] [CrossRef] [Green Version]
  51. Li, S.; Kang, X.; Hu, J. Image fusion with guided filtering. IEEE Trans. Image Process. 2013, 22, 2864–2875. [Google Scholar] [PubMed]
  52. Zhang, Y.; Xiang, W.; Zhang, S. Local extreme map guided multi-modal brain image fusion. Front. Neurosci. 2022, 16, 1055451. [Google Scholar] [CrossRef]
  53. Zhang, Y.; Liu, Y.; Sun, P. IFCNN: A general image fusion framework based on convolutional neural network. Inf. Fusion 2020, 54, 99–118. [Google Scholar] [CrossRef]
  54. Zhang, H.; Xu, H.; Xiao, Y. Rethinking the image fusion: A fast unified image fusion network based on proportional maintenance of gradient and intensity. In Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI), New York, NY, USA, 7–12 February 2020; Volume 34, pp. 12797–12804. [Google Scholar]
  55. Xu, H.; Ma, J.; Jiang, J. U2Fusion: A unified unsupervised image fusion network. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 44, 502–518. [Google Scholar] [CrossRef]
  56. Dong, Y.; Chen, Z.; Li, Z.; Gao, F. A multi-branch multi-scale deep learning image fusion algorithm based on DenseNet. Appl. Sci.-Basel 2022, 12, 10989. [Google Scholar] [CrossRef]
  57. Liu, Y.; Liu, S.; Wang, Z. A general framework for image fusion based on multi-scale transform and sparse representation. Inf. Fusion 2015, 24, 147–164. [Google Scholar] [CrossRef]
  58. Liu, Y.; Wang, Z. A practical pan-sharpening method with wavelet transform and sparse representation. In Proceedings of the IEEE International Conference on Imaging Systems and Techniques (IST), Beijing, China, 22–23 October 2013; pp. 288–293. [Google Scholar]
  59. Nejati, M.; Samavi, S.; Shirani, S. Multi-focus image fusion using dictionary-based sparse representation. Inf. Fusion 2015, 25, 72–84. [Google Scholar] [CrossRef]
  60. Hu, X.; Jiang, J.; Liu, X.; Ma, J. ZMFF: Zero-shot multi-focus image fusion. Inf. Fusion 2023, 92, 127–138. [Google Scholar] [CrossRef]
  61. Qu, X.; Yan, J.; Xiao, H. Image fusion algorithm based on spatial frequency-motivated pulse coupled neural networks in nonsubsampled contourlet transform domain. Acta Autom. Sin. 2008, 34, 1508–1514. [Google Scholar] [CrossRef]
  62. Liu, Z.; Blasch, E.; Xue, Z. Objective assessment of multiresolution image fusion algorithms for context enhancement in night vision: A comparative study. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 34, 94–109. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Diagram of the proposed fusion method.
Figure 2. Lytro dataset.
Figure 3. Fusion results on the first pair of images. (a) NSCT; (b) CVT; (c) NSST; (d) IFCNN; (e) PMGI; (f) U2Fusion; (g) LEGFF; (h) ZMFF; (i) Proposed.
Figure 4. Fusion results on the second pair of images. (a) NSCT; (b) CVT; (c) NSST; (d) IFCNN; (e) PMGI; (f) U2Fusion; (g) LEGFF; (h) ZMFF; (i) Proposed.
Figure 5. Fusion results on the third pair of images. (a) NSCT; (b) CVT; (c) NSST; (d) IFCNN; (e) PMGI; (f) U2Fusion; (g) LEGFF; (h) ZMFF; (i) Proposed.
Figure 6. Fusion results on the fourth pair of images. (a) NSCT; (b) CVT; (c) NSST; (d) IFCNN; (e) PMGI; (f) U2Fusion; (g) LEGFF; (h) ZMFF; (i) Proposed.
Figure 7. Fusion results on other images in Figure 2.
Figure 8. Line chart of metrics data with different methods in Figure 2.
Table 1. Objective evaluation of methods in Figure 3.

Method      QAB/F    QCB      QY       QE       QG       QNCIE    QMI      QP
NSCT        0.7092   0.7108   0.9248   0.8798   0.6831   0.8216   6.0272   0.6575
CVT         0.7373   0.7576   0.9652   0.8825   0.7191   0.8225   6.1919   0.7694
NSST        0.6526   0.7195   0.8680   0.8543   0.6167   0.8180   5.3646   0.4842
IFCNN       0.7412   0.7622   0.9666   0.8848   0.7207   0.8234   6.3637   0.7727
PMGI        0.5466   0.6070   0.7656   0.6316   0.5156   0.8169   5.1347   0.3925
U2Fusion    0.6575   0.6164   0.8832   0.7952   0.6338   0.8176   5.2894   0.6640
LEGFF       0.6923   0.6857   0.9164   0.8205   0.6658   0.8158   4.8919   0.6937
ZMFF        0.7342   0.7802   0.9644   0.8779   0.7134   0.8222   6.1505   0.7673
Proposed    0.7446   0.7760   0.9708   0.8868   0.7273   0.8243   6.5008   0.7860
Table 2. Objective evaluation of methods in Figure 4.

Method      QAB/F    QCB      QY       QE       QG       QNCIE    QMI      QP
NSCT        0.7276   0.6578   0.9363   0.8598   0.7110   0.8337   7.5353   0.8483
CVT         0.7411   0.6838   0.9561   0.8661   0.7332   0.8333   7.5911   0.8879
NSST        0.6934   0.6588   0.9376   0.8391   0.6667   0.8308   7.1464   0.7872
IFCNN       0.7315   0.6825   0.9349   0.8663   0.7205   0.8324   7.4651   0.8744
PMGI        0.4798   0.5977   0.7132   0.5816   0.4592   0.8251   6.3071   0.4944
U2Fusion    0.5951   0.4969   0.6918   0.6838   0.5786   0.8242   6.1325   0.7393
LEGFF       0.6770   0.6466   0.8394   0.7920   0.6603   0.8225   5.8173   0.8132
ZMFF        0.7085   0.6544   0.9171   0.8568   0.6927   0.8297   7.0711   0.8365
Proposed    0.7404   0.6924   0.9593   0.8684   0.7326   0.8332   7.5773   0.8860
Table 3. Objective evaluation of methods in Figure 5.

Method      QAB/F    QCB      QY       QE       QG       QNCIE    QMI      QP
NSCT        0.6800   0.6636   0.9272   0.8345   0.6807   0.8239   6.2537   0.7682
CVT         0.7008   0.7053   0.9498   0.8626   0.7020   0.8213   5.9563   0.7959
NSST        0.6359   0.6789   0.9138   0.7973   0.6372   0.8201   5.7004   0.6903
IFCNN       0.7060   0.7047   0.9509   0.8676   0.7047   0.8241   6.4525   0.8134
PMGI        0.4143   0.5467   0.7409   0.5178   0.4139   0.8198   5.6679   0.4286
U2Fusion    0.6059   0.5849   0.7903   0.7988   0.6064   0.8193   5.5648   0.6591
LEGFF       0.6764   0.7090   0.9104   0.8557   0.6757   0.8198   5.6745   0.7630
ZMFF        0.6898   0.7406   0.9408   0.8563   0.6890   0.8234   6.3327   0.7934
Proposed    0.7134   0.7222   0.9589   0.8710   0.7139   0.8231   6.2696   0.8194
Table 4. Objective evaluation of methods in Figure 6.

Method      QAB/F    QCB      QY       QE       QG       QNCIE    QMI      QP
NSCT        0.6961   0.6866   0.9407   0.8496   0.6975   0.8363   7.7208   0.7916
CVT         0.7125   0.7240   0.9515   0.8656   0.7114   0.8343   7.5880   0.8219
NSST        0.5955   0.6809   0.8837   0.7067   0.5944   0.8308   7.0354   0.6179
IFCNN       0.7103   0.7098   0.9399   0.8679   0.7112   0.8364   7.8860   0.8101
PMGI        0.3491   0.5517   0.6784   0.3959   0.3491   0.8285   6.7140   0.3640
U2Fusion    0.5988   0.5576   0.7763   0.7853   0.5985   0.8282   6.6513   0.6573
LEGFF       0.6639   0.6739   0.8700   0.8327   0.6649   0.8279   6.5996   0.7240
ZMFF        0.6780   0.7229   0.9196   0.8539   0.6774   0.8340   7.5359   0.7762
Proposed    0.7148   0.7301   0.9584   0.8691   0.7162   0.8363   7.8484   0.8249
Table 5. Average metrics data of different methods in Figure 8.

Method      QAB/F    QCB      QY       QE       QG       QNCIE    QMI      QP
NSCT        0.7103   0.6799   0.9208   0.8644   0.7058   0.8280   6.7075   0.7616
CVT         0.7292   0.7265   0.9434   0.8764   0.7257   0.8281   6.7485   0.7985
NSST        0.6720   0.6895   0.8955   0.8247   0.6655   0.8254   6.3212   0.6932
IFCNN       0.7337   0.7292   0.9519   0.8792   0.7297   0.8298   7.0353   0.8178
PMGI        0.3901   0.5656   0.6738   0.4736   0.3857   0.8225   5.8641   0.4620
U2Fusion    0.6143   0.5682   0.7912   0.7835   0.6093   0.8221   5.7765   0.6657
LEGFF       0.6810   0.6751   0.8817   0.8195   0.6754   0.8214   5.6138   0.7565
ZMFF        0.7087   0.7412   0.9313   0.8687   0.7030   0.8271   6.6271   0.7853
Proposed    0.7343   0.7436   0.9538   0.8808   0.7317   0.8299   7.0260   0.8076
Citation

Li, L.; Lv, M.; Jia, Z.; Ma, H. Sparse Representation-Based Multi-Focus Image Fusion Method via Local Energy in Shearlet Domain. Sensors 2023, 23, 2888. https://0-doi-org.brum.beds.ac.uk/10.3390/s23062888
