Renbi Zhao*
(School of Management, Guangdong Nanfang Institute of Technology, Jiangmen, 529000, China, RenbiZhao@outlook.com)
Copyright © The Institute of Electronics and Information Engineers(IEIE)
Keywords
Deep residual shrinkage network, Travel, Image retrieval, Attention mechanism
1. Introduction
With the rapid development of the Internet, people generate a large amount of image data while traveling. Using these images for tourism image retrieval can provide users with accurate and efficient information retrieval services, helping them choose destinations and plan routes. However, tourism image retrieval faces several challenges that stem from the nature of image data, such as the high dimensionality of image features, the richness of image semantics, and the measurement of similarity between images [1,2].
To address these issues, this study improves the deep residual shrinkage network (DRSN) and applies it to tourism image retrieval. DRSN is a deep-learning-based image feature extraction network with good feature learning and generalization capabilities [3,4].
The study first uses a deactivation mechanism and an activation function to improve DRSN so that it can extract more image features. A global average pooling layer and a new activation function are introduced into DRSN to enhance its learning ability and generalization performance. The attention mechanism is then incorporated into the improved DRSN. It enables the network to learn the weights of different image features automatically and to focus on the features related to the query image [5,6].
By improving DRSN, the study aims to enhance the representation of tourism image features and the retrieval performance, and to provide a new and effective solution for tourism image retrieval that meets the needs of practical applications. The innovation of the research lies in introducing an attention mechanism to strengthen the ability of DRSN to capture the semantic information of images. The backbone extraction module extracts semantic features and the branch mask extraction module performs feature selection, thereby improving the image retrieval capability of the model.
The first part of the study uses the deactivation mechanism and activation functions to improve the deep residual shrinkage network. The second part combines the attention mechanism with the improved deep residual shrinkage network to construct a tourism image retrieval model. The third part verifies the performance of the constructed model for tourism image retrieval through simulation experiments and practical applications. The fourth part summarizes the experimental results and analyzes the advantages and disadvantages of the research method.
2. Related Works
With the rapid development and popularization of Internet technology, the demand for tourism information grows day by day. Tourism images, as an intuitive and rich information carrier, play an important role in tourism information retrieval. However, efficiently retrieving and managing large volumes of tourism image data remains a challenge. In recent years, the rapid development of deep learning has provided new solutions for image processing, and many researchers have studied the problem. Su et al. proposed a marketing method based on tourism image retrieval to promote ecotourism. The method first collected tourist information through questionnaires and then used that information to process images of the scenic spots involved. The results showed that processing the images in this way significantly increased tourists' interest in ecotourism [7]. Maree et al. designed a mobile recommendation system with multilingual, multi-standard semantics to improve tourists' experience of retrieving tourism services. The system offered users several image search functions and drew on a database for its recommendations. The results indicated that the system could find the corresponding image information from the questions of tourists [8]. Zheng et al. used coupling theory to examine the image of tourism destinations, studying the emotional connection between tourism purpose and purchase intention. Searching for information related to a tourism purpose gave tourists certain expectations of the destination. The results indicated a clear coordination relationship between the evaluation values of tourism images and tourism figures [9]. Ageeva and Foroudi established a conceptual model based on tourism images and tourist behavior to support the development of tourism. The model analyzed the image of local brands in tourist destinations together with the supply and demand of tourists to find the correlation between the two. The results indicated that the model could support planning for tourist arrivals [10].
In image restoration, Zhou et al. applied a cycle-consistent generative adversarial network to image enhancement and built a network model. The model used three paths to extract features, thereby addressing problems of image color difference, features, and discrimination. The results indicated that the model significantly improved image quality [11]. Yan et al. designed an inversion method for coupled typical error sources in remote sensing imaging to improve the imaging quality of optical systems. The method processed images using a modulation transfer function model and the principle of decoupling coupled error sources. The results showed that the maximum relative error between the inverted values and the true values for distorted remote sensing images was no more than 20%, and in most cases below 10%, indicating good inversion performance [12]. To improve image denoising, Thakur et al. designed a model that combines Markov models with convolutional neural networks (CNN). The model could use deep shrinkage to process image noise quickly and effectively. The performance of these CNN models was analyzed on the BSD-68 and Set-12 datasets, where PDNN achieved the best PSNR [13].
In summary, research on tourism image retrieval based on an attention-improved deep residual shrinkage network is of great significance. Integrating the attention mechanism with DRSN allows tourism images to be analyzed and processed effectively to produce retrieval results. The research aims to provide stronger support for tourism image retrieval and the development of the tourism industry.
3. Construction of a Tourism Image Retrieval Model Based on Improved DRSN Network
DRSN is a network with a residual structure and a self-attention mechanism that has shown excellent performance in image feature extraction and semantic understanding. However, existing DRSNs still have some issues, such as weak robustness to noise and interference and limited feature extraction capability. A tourism image retrieval model based on an improved DRSN was therefore constructed to improve the performance and efficiency of tourism image retrieval.
3.1 Image Feature Extraction Based on Improved DRSN
With the rapid development of the Internet, people are increasingly willing to obtain information about tourist destinations online. Displaying tourism images gives tourists who are planning a trip a useful reference, which in turn promotes the development of the tourism industry. However, in addition to scenic spot information, tourism images contain a great deal of irrelevant information, which can interfere with the accuracy of tourism image retrieval. Traditional DRSN designs therefore use excessively deep networks to filter out irrelevant information and strengthen the features of useful information. Although this reduces the extraction of irrelevant features to some extent, it also degrades network performance, leading to vanishing gradients and overfitting, and thereby lowers the accuracy of feature extraction [14,15,16].
Based on this, the study uses a random deactivation mechanism (commonly known as dropout) and an activation function to improve DRSN for image feature extraction. Random deactivation is a training strategy that reduces overfitting by randomly dropping the outputs of some nodes during training. Specifically, in each layer of the network, a portion of the nodes is retained and the remaining nodes are dropped. In each iteration, the deactivation mechanism therefore changes the DRSN network structure substantially, and each resulting structure can be treated as an independent network. The output of DRSN can then be regarded as the sum of the predictions of all these independent networks. On this basis, the deactivation mechanism reduces overfitting and improves the robustness of the network. The processing flow of the deactivation mechanism is shown in Fig. 1.
Fig. 1. Flowchart for handling deactivation mechanisms.
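The random deactivation step described above can be illustrated with a minimal sketch. It is not the implementation used in the study: the function name `dropout`, the retention probability `keep_prob`, and the rescaling of retained outputs by `1 / keep_prob` (so-called inverted dropout) are assumptions made for illustration.

```python
import random

def dropout(activations, keep_prob=0.8, training=True, rng=None):
    """Randomly deactivate node outputs during training (illustrative sketch).

    Retained outputs are divided by keep_prob so that the expected value of
    each output is unchanged; this rescaling is an assumption, not a detail
    taken from the paper.
    """
    if not training:
        # At inference time every node is kept and nothing is rescaled.
        return list(activations)
    rng = rng or random.Random()
    out = []
    for a in activations:
        if rng.random() < keep_prob:
            out.append(a / keep_prob)  # node retained
        else:
            out.append(0.0)            # node deactivated
    return out

layer_output = [0.5, -1.2, 3.0, 0.7, 2.1]
train_out = dropout(layer_output, keep_prob=0.8, rng=random.Random(0))
eval_out = dropout(layer_output, training=False)
```

Each training call samples a different set of retained nodes, which is why every iteration can be viewed as training a different sub-network.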
After the deactivation mechanism has been used to reduce overfitting in the DRSN network, the activation function can be used to improve the network further. To determine the effects of different activation functions, two are selected for comparison: the parametric rectified linear unit (PReLU) and the exponential linear unit (ELU). The PReLU activation function can be represented by formula (1).
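Assuming the standard definition of the parametric rectified linear unit, formula (1) takes the following form:

```latex
f(x_i) =
\begin{cases}
x_i, & x_i > 0 \\
a_i x_i, & x_i \le 0
\end{cases}
\tag{1}
```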
In formula (1), $x_{i}$ represents the input of the $i$th channel of the PReLU activation function, and $a_{i}$ represents the parameter that controls the slope of the negative half axis. PReLU adapts the activation by adding only a small number of parameters, which limits the risk of overfitting during network fitting. The ELU activation function can be defined using formula (2).
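Assuming the standard definition of the exponential linear unit, formula (2) takes the following form. The scale factor $\alpha > 0$ of the negative branch is a symbol introduced here for illustration and is commonly set to 1:

```latex
f(x_j) =
\begin{cases}
x_j, & x_j > 0 \\
\alpha \left( e^{x_j} - 1 \right), & x_j \le 0
\end{cases}
\tag{2}
```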
In formula (2), $x_{j}$ represents the input of the $j$th channel of the ELU activation function. Comparing formula (1) with formula (2) shows that the positive and negative interval characteristics of ELU have wider adaptability. In the positive interval ELU does not saturate, while in the negative interval it can take negative values. This keeps the mean output of the activation function close to 0, so the robustness of the model to noise does not decrease. ELU is therefore more suitable for image feature extraction than PReLU. After the activation function is determined, the residual modules of the DRSN network are stacked. Once the activation function is added, some information is irreversibly lost between the input and output of the network. If the lost information contains feature information, the performance of the network model suffers. The study therefore uses identity mappings to pass shallow-layer features unchanged to the deeper layers of DRSN, and fuses them with the residual outputs produced during layer-to-layer transmission [17]. This enables the network structure to adapt more flexibly to various data distributions and patterns. The schematic diagram of the residual module used in the study is shown in Fig. 2.
Fig. 2. Schematic diagram of residual module.
As shown in Fig. 2, the study uses a three-layer residual module that combines two $1\times1$ convolutional layers with one $3\times3$ convolutional layer. The first $1\times1$ convolution reduces the dimensionality of the input data from 256 to 64, and the second $1\times1$ convolution restores it. The residual structure can be represented by formula (3).
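Assuming the usual identity-mapping form of a residual unit, formula (3) can be written as:

```latex
x_{l+1} = x_{l} + F(x_{l}, W_{l})
\tag{3}
```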
In formula (3), $x_{l}$ represents the features of the shallow unit $l$, $F$ represents the residual function, and $W_{l}$ represents the parameters of unit $l$. By applying the residual structure recursively, the feature expression of any deep unit can be obtained, which can be represented by formula (4).
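Applying the residual structure recursively from unit $l$ up to a deeper unit $L$, and assuming identity shortcuts throughout, formula (4) can be written as:

```latex
x_{L} = x_{l} + \sum_{i=l}^{L-1} F(x_{i}, W_{i})
\tag{4}
```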
In formula (4), $x_{L}$ represents the features of any deep unit $L$, $\sum_{i=l}^{L-1} F$ represents the accumulated residual units between units $l$ and $L$, and $W_{i}$ represents the parameters of unit $i$. Combining formula (3) with formula (4) gives the sum of the residual functions over the whole network, which can be represented by formula (5).
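Setting $l = 0$ in the recursive expression gives one plausible form of formula (5), in which $x_{0}$ is taken to be the input of the first unit (an assumption about the intended notation):

```latex
x_{L} = x_{0} + \sum_{i=0}^{L-1} F(x_{i}, W_{i})
\tag{5}
```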
In formula (5), $x_{0}$ represents the input of the first unit, to which the residual functions of all network layers are added. This processing ensures that the network gradient in the DRSN network always exists, thus enabling feature recognition and extraction in tourism images.
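The relationship between the step-by-step residual update and its unrolled sum can be checked numerically with a small sketch. The residual function `residual_fn` (a scaled `tanh`) and the scalar weights are hypothetical stand-ins chosen only to keep the example short; they are not the convolutional units used in the study.

```python
import math

def residual_fn(x, w):
    # Toy residual function F(x, W): element-wise scaled tanh (hypothetical).
    return [w * math.tanh(v) for v in x]

def forward(x0, weights):
    """Apply x_{l+1} = x_l + F(x_l, W_l) once per weight, recording each F."""
    x = list(x0)
    residuals = []
    for w in weights:
        f = residual_fn(x, w)
        residuals.append(f)
        x = [xi + fi for xi, fi in zip(x, f)]
    return x, residuals

x0 = [0.5, -1.0, 2.0]
weights = [0.3, -0.2, 0.1, 0.4]
x_L, residuals = forward(x0, weights)

# Unrolled form: the deep output equals the input plus the sum of all residuals.
unrolled = [x0[k] + sum(f[k] for f in residuals) for k in range(len(x0))]
```

Because the input $x_{0}$ appears as an additive term in the unrolled form, the derivative of the output with respect to the input always contains an identity component, which is the usual argument for why gradients do not vanish in such a network.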
3.2 Design of a Tourism Image Retrieval Model Combining Attention Mechanism and Improved DRSN
Analysis of the improved DRSN shows that it focuses mainly on feature extraction and dimensionality reduction and may fail to capture the semantic information of images. Semantic information is essential in tourism image retrieval, so a method is needed that extracts image features and semantic information at the same time. Based on this, the study introduces an attention mechanism into the improved DRSN to construct a retrieval model for tourism images [18,19]. The DRSN model that integrates the attention mechanism can recognize the approximate foreground position of target objects in tourism images. The attention module used in the study consists of two parts, the backbone extraction module and the branch mask extraction module. The backbone extraction module performs feature extraction and the branch mask extraction module performs feature selection. The output of the attention module, which combines the two, can be represented by formula (6).
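Assuming the element-wise masking used in residual attention networks, and writing $H_{i,c}(x)$ (a symbol introduced here) for the output of the attention module, formula (6) can be written as:

```latex
H_{i,c}(x) = M_{i,c}(x) \cdot T_{i,c}(x)
\tag{6}
```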
In formula (6), $M(x)$ represents the mask produced by the branch mask extraction module, $i$ ranges over all spatial positions, $c$ represents the channel index, and $T(x)$ represents the output of the backbone extraction module. The branch mask extraction module performs feature selection during forward propagation and also acts as a filter for gradient updates during backpropagation. The gradient of the mask with respect to the input feature can then be represented by formula (7).
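Assuming the mask depends only on its own parameters $\theta$ and the backbone output only on $\phi$, differentiating the masked output with respect to $\phi$ gives the following form of formula (7):

```latex
\frac{\partial M(x, \theta) \, T(x, \phi)}{\partial \phi}
= M(x, \theta) \, \frac{\partial T(x, \phi)}{\partial \phi}
\tag{7}
```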
In formula (7), $\theta$ represents the parameters of the mask branch and $\phi$ represents the parameters of the backbone branch. This property of the mask makes the attention module robust to noise and prevents noise from affecting the gradient-based parameter updates of the backbone branch. However, the study found that simply stacking attention modules in the DRSN model can adversely affect the performance of the attention module. The study therefore applies an identity mapping to the attention mask module, so that the output of the attention module is updated to formula (8).
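Assuming the identity mapping is added to the mask in the manner of attention residual learning, and writing $H_{i,c}(x)$ (a symbol introduced here) for the module output, formula (8) can be written as:

```latex
H_{i,c}(x) = \left( 1 + M_{i,c}(x) \right) \cdot F_{i,c}(x)
\tag{8}
```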
In formula (8), $F(x)$ represents the original features of the image and $F_{i,c}(x)$ represents the residual function. $M(x)$ takes values in [0,1], and as $M(x)$ approaches 0, the output of the attention module approaches the original image features. The attention residual module constructed in this way differs from the original residual network. The difference can be described in terms of the residual learning expression, which is given by formula (9).
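Assuming the standard residual learning formulation, with $H_{i,c}(x)$ (a symbol introduced here) denoting the unit output, formula (9) can be written as:

```latex
H_{i,c}(x) = x + F_{i,c}(x)
\tag{9}
```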
In the original residual network, $F_{i,c}(x)$ represents the residual function, while in the attention residual module constructed in the study, $F_{i,c}(x)$ represents the attention features generated from the output of the convolutional network. In the model that fuses the attention mechanism with DRSN, the attention residual module uses the mask branch as a feature selector, thereby preserving the performance of the backbone extraction module. It also passes the original features of the image quickly to the next layer, which reduces the loss of feature information. To enable the residual extraction module to collect more feature information from the image, the attention residual module is added to the feature extraction stage of the original feature extraction process [20]. The image feature extraction process based on the attention residual network model is shown in Fig. 3.
Fig. 3. Image feature extraction flowchart based on attention residual network model.
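The effect of combining a soft mask with backbone features through an identity term can be sketched as follows. This is an illustration under stated assumptions rather than the model of the study: the mask is assumed to come from a sigmoid over hypothetical `mask_logits`, and the feature values are arbitrary.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def attention_residual(features, mask_logits):
    """Return (1 + M) * F element-wise, with the soft mask M = sigmoid(logit)."""
    out = []
    for f_row, m_row in zip(features, mask_logits):
        out.append([(1.0 + sigmoid(m)) * f for f, m in zip(f_row, m_row)])
    return out

# Backbone features for a 2 x 2 spatial grid (single channel, toy values).
features = [[0.2, 1.5], [-0.7, 3.0]]
# Large negative logit -> mask near 0; large positive logit -> mask near 1.
mask_logits = [[-20.0, 0.0], [20.0, 2.0]]
H = attention_residual(features, mask_logits)
```

Where the mask is close to 0 the output stays close to the original feature, and where it is close to 1 the feature is roughly doubled, so the mask emphasizes positions without ever erasing the backbone features.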
During feature extraction, each feature map is compressed into a single real-valued feature value that summarizes the information in the corresponding feature map. Applying global pooling to the feature maps in this way yields a weight vector, which can be represented by formula (10).
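Assuming global average pooling over the spatial dimensions, with $u_{c}$ taken to be the feature map of channel $c$ and $z_{c}$ its pooled value (an interpretation of the symbols, not a quotation), formula (10) can be written as:

```latex
z_{c} = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} u_{c}(i, j)
\tag{10}
```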
In formula (10), $H$ and $W$ represent the height and width of the image, $u$ represents the feature map being pooled, $z$ represents the resulting global attention value, and $c$ represents the channel index. After the weight vector is obtained, it is passed through the network layers with an exponential linear activation function, which can be represented by formula (11).
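One plausible form of formula (11), assuming a two-layer gating structure in which $\delta$ denotes the exponential linear activation and $\sigma$ denotes a sigmoid gate (both symbols, and the output symbol $s$, are introduced here for illustration), is:

```latex
s = \sigma\left( W_{2} \, \delta\left( W_{1} z \right) \right)
\tag{11}
```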
In formula (11), $W_{1}$ and $W_{2}$ represent the weights of the network layers, and $W_{1} z$ is the input to the activation operation. Formula (11) gives the importance of each feature map. Based on the above analysis, the tourism image retrieval process that fuses the attention mechanism with the improved DRSN is shown in Fig. 4.
Fig. 4. Flow chart of tourism image retrieval based on integrating attention mechanism
and improving DRSN.
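The pooling and weighting steps described above can be sketched in a few lines. The sketch assumes a two-layer gate with an exponential linear activation followed by a sigmoid; the sigmoid, the helper names, and the weight matrices `w1` and `w2` are illustrative assumptions and not values from the study.

```python
import math

def global_average_pool(feature_maps):
    """Compress each H x W feature map into one scalar, its mean."""
    return [sum(sum(row) for row in fm) / (len(fm) * len(fm[0]))
            for fm in feature_maps]

def elu(v, alpha=1.0):
    return v if v > 0 else alpha * (math.exp(v) - 1.0)

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def matvec(matrix, vec):
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def channel_weights(feature_maps, w1, w2):
    """Return one importance weight in (0, 1) per feature map."""
    z = global_average_pool(feature_maps)
    hidden = [elu(h) for h in matvec(w1, z)]
    return [sigmoid(s) for s in matvec(w2, hidden)]

# Three 2 x 2 feature maps (toy values).
feature_maps = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 0.0]],
    [[-1.0, 1.0], [2.0, 2.0]],
]
w1 = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]    # 3 channels -> 2 hidden units
w2 = [[1.0, -1.0], [0.5, 0.5], [-0.3, 0.9]]  # 2 hidden units -> 3 channels
z = global_average_pool(feature_maps)
weights = channel_weights(feature_maps, w1, w2)
```

Multiplying each feature map by its weight would then emphasize the channels the gate considers important.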
4. Performance Analysis of Tourism Image Retrieval Model Based on Improved DRSN
To verify the performance of the tourism image retrieval model, the Landscape-dataset was used for testing. The model based on improved DRSN was trained on this dataset and its performance was then evaluated.
4.1 Performance Analysis of Tourism Image Retrieval Models
To verify the performance of the tourism image retrieval model, the study visualized the image features extracted by the model at each stage, which gives a more intuitive understanding of how the neural network extracts features. In addition, further feature extraction results were used to analyze the feature extraction capability of the model. The visualization results of feature extraction using landscape samples are shown in Fig. 5.
Fig. 5. Visualization results of feature extraction from landscape samples.
Fig. 5 shows that, after the attention mechanism and the other improvements are introduced, the feature extraction effect of the model is clear: non-feature areas are diluted so that the contours of the features stand out. To verify the performance of the improved DRSN retrieval model trained on the dataset, the Landscape-dataset (https://github.com/koishi70/Landscape-Dataset) was used for testing. A total of 6000 images were selected and divided into two samples of higher and lower complexity, with 3000 tourism images per sample. Landscape Dataset is a large-scale natural landscape image dataset created and maintained by developer Yuweiming70. It contains tens of thousands of high-quality landscape images, carefully labeled and classified by geographical environment and weather conditions. Each category has a large number of samples, which ensures diversity and supports generalization during training. The clear structure of the dataset makes it a resource for deep learning and computer vision researchers to train and test models, especially for tasks such as landscape classification and object detection. The loss values of the three methods on the two samples are compared in Fig. 6.
Fig. 6. Comparison of loss values of three methods on two datasets.
According to Fig. 6(a), on sample 1 of the Landscape-dataset the loss of the improved DRSN changed relatively steadily after 51 iterations, with an average loss of 0.21. The fluctuations of RNN and CNN slowed after 49 and 47 iterations, respectively, but their amplitude remained relatively large, and their average losses were 0.48 and 0.62. According to Fig. 6(b), on sample 2 the loss of the improved DRSN flattened at 39 iterations and stabilized at 157 iterations, with an average loss of 0.16. The fluctuations of RNN and CNN slowed after 47 and 41 iterations, respectively, but their amplitude remained relatively large, and their average losses were 0.51 and 0.93. This indicated that the tourism image retrieval model based on improved DRSN was more robust. To verify the retrieval accuracy of the model on the dataset, the study also compared the image retrieval accuracy of the three methods. The results on the two samples are shown in Table 1.
In Table 1, in the sample 1 test, the accuracy of the improved DRSN stabilized after 118 iterations at 94.52%. RNN stabilized after 135 iterations at 83.94%, and CNN stabilized after 157 iterations at 72.52%. In the sample 2 test, the image retrieval accuracy of the improved DRSN was 90.88%, while that of RNN and CNN was 81.72% and 75.88%, respectively. This indicated that the image retrieval model based on improved DRSN had higher accuracy and could increase the probability of images being retrieved. To further validate the retrieval performance of the model, the study used precision and recall as validation metrics. Fig. 7 compares the precision and recall of the three methods in the image retrieval process.
Table 1. Comparison of image retrieval accuracy of three methods on two datasets.
| Algorithm | Data set | Iterations required to reach convergence | Retrieval accuracy (%) |
|---|---|---|---|
| CNN | Sample 1 | 157 | 72.52 |
| CNN | Sample 2 | 182 | 75.88 |
| RNN | Sample 1 | 135 | 83.94 |
| RNN | Sample 2 | 164 | 81.72 |
| Improved DRSN | Sample 1 | 118 | 94.52 |
| Improved DRSN | Sample 2 | 106 | 90.88 |
Fig. 7. Comparison results of precision and recall of three methods in two datasets.
According to Fig. 7(a), all three methods achieved high precision in retrieving tourism images. The precision of the improved DRSN was 92.61%, while that of RNN and CNN was 88.95% and 86.13%, respectively. According to Fig. 7(b), all three methods also achieved good recall in tourism image retrieval. The recall of the improved DRSN was 96.48%, while that of RNN and CNN was 91.05% and 89.22%, respectively. This indicated that the tourism image retrieval model based on improved DRSN had strong robustness and applicability.
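For reference, precision and recall for a single query can be computed as in the sketch below, using their standard definitions. The image identifiers are made up for illustration and do not come from the experiments.

```python
def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant, for one query."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall

# Hypothetical query: five images returned, four images actually relevant.
retrieved = ["img_03", "img_07", "img_12", "img_20", "img_31"]
relevant = ["img_03", "img_07", "img_12", "img_44"]
p, r = precision_recall(retrieved, relevant)
```

Here three of the five returned images are relevant and three of the four relevant images are found, so the two metrics capture different kinds of error.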
4.2 Application Performance Analysis of Tourism Image Retrieval Models
To verify the practical application performance of the tourism image retrieval model, the study compared it with the Average Hash Algorithm (AHA). AHA converts an image to grayscale, calculates the average brightness of the image, and then compares each pixel with that average to generate a hash value, which is used for image retrieval and keeps accuracy high while remaining efficient. The volume of tourism data is huge, so model efficiency is an important consideration. AHA is fast to compute and independent of image size, which makes it efficient on large amounts of data and well suited to practical applications. AHA was therefore chosen as the comparison algorithm for the study.
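The hashing steps described above can be sketched as follows. The sketch assumes the image is already grayscale and is first shrunk to a small fixed grid by block averaging, which is a common step in average hashing but is not stated in the text; the function names and the synthetic images are illustrative.

```python
def shrink(gray, size):
    """Block-average a 2-D grayscale image down to size x size.

    Assumes the height and width are multiples of size.
    """
    bh, bw = len(gray) // size, len(gray[0]) // size
    small = []
    for r in range(size):
        row = []
        for c in range(size):
            block = [gray[r * bh + i][c * bw + j]
                     for i in range(bh) for j in range(bw)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small

def average_hash(gray, size=8):
    """One bit per cell: 1 if the cell is brighter than the mean, else 0."""
    pixels = [p for row in shrink(gray, size) for p in row]
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]

def hamming(h1, h2):
    """Number of differing bits; smaller means more similar images."""
    return sum(a != b for a, b in zip(h1, h2))

# 16 x 16 synthetic images: dark left half, bright right half.
img = [[30 if x < 8 else 200 for x in range(16)] for _ in range(16)]
brighter = [[min(255, p + 20) for p in row] for row in img]
inverted = [[255 - p for p in row] for row in img]

h_img = average_hash(img)
d_brighter = hamming(h_img, average_hash(brighter))
d_inverted = hamming(h_img, average_hash(inverted))
```

A uniform brightness shift leaves the hash unchanged because every pixel and the mean move together, whereas inverting the image flips every bit.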
In tourism images, lighting conditions and occlusion can affect image retrieval and recognition. The study compared the retrieval model with AHA to verify its performance in tourism image retrieval under different environments. Fig. 8 shows the recognition accuracy of the two methods for the same tourism images in different environments.
Fig. 8. The recognition accuracy of two methods for the same tourism image in different
environments.
As shown in Fig. 8(a), under normal lighting the improved DRSN had the highest recognition accuracy for tourism images: 269 of 300 images were accurately identified, with an average accuracy of 89.51%, while AHA accurately identified 249 of 300 images, with an average accuracy of 82.97%. As shown in Fig. 8(b), the improved DRSN also outperformed AHA in dimly lit environments: it accurately identified 179 of 300 images, with an average accuracy of 59.64%, while AHA accurately identified 157 of 300 images, with an average accuracy of 52.33%. According to Fig. 8(c), the retrieval accuracy of both methods decreased significantly in an occluded environment, indicating that occlusion has a significant impact on image retrieval. The improved DRSN and AHA accurately identified 124 and 116 of 300 images, with accuracy rates of 41.25% and 38.66%, respectively. To verify how the shooting angle of an image affects retrieval, experiments were conducted at three different observation angles, and the results are shown in Fig. 9.
Fig. 9. Comparison of ground cloud recognition results between two methods at different
observation angles.
In Fig. 9(a), at a horizontal angle the average recognition rate of the improved DRSN reached 93.52%, with 468 of 500 images accurately identified, while AHA accurately identified 444 of 500 images, with an average recognition rate of 89.07%. As shown in Fig. 9(b), the retrieval ability of both methods decreased slightly at the elevation angle. The average recognition rate of the improved DRSN was 79.68% and that of AHA was 73.14%, with 400 and 366 of 500 images accurately identified, respectively. As shown in Fig. 9(c), under vertical rotation the retrieval ability of both methods decreased significantly: they accurately identified 216 and 201 of 500 images, with recognition accuracy of 43.27% and 40.04%, respectively. This verified that the retrieval model recognized images more accurately than AHA at every shooting angle tested. To further verify the operational capability of the retrieval model, the study used retrieval time as a comparative indicator and compared the time consumption of the improved DRSN and AHA with the true values. Fig. 10 shows the time consumption results in tourism image retrieval.
Fig. 10. Comparison of time consumption of three methods in datasets with different
amounts of data.
According to the comparative analysis of Fig. 10(a), 10(b), and 10(c), the retrieval time of the true value was 0.95 s, that of the improved DRSN was 1.28 s, and that of AHA was 1.46 s. In terms of time, the retrieval model was therefore the closest to the true value, with a difference of 0.27 seconds, indicating that the retrieval model designed in the study could retrieve tourism images effectively.
5. Conclusion
To improve the retrieval and recognition of tourism images, an improved DRSN based on the attention mechanism was proposed and used to design an image retrieval and recognition model. The study first used a deactivation mechanism and activation functions to improve DRSN and applied it to feature extraction in tourism images. The attention mechanism was then introduced into the improved network to construct a retrieval model. The results showed that under normal lighting, low lighting, and occlusion, the image retrieval accuracy of the retrieval model was 89.51%, 59.64%, and 41.25%, respectively. At the horizontal angle, the elevation angle, and under vertical rotation, its recognition rate was 93.52%, 79.68%, and 43.27%, respectively. The retrieval model took 1.28 seconds per retrieval, which was very close to the true value, with a difference of 0.27 seconds. On all comparison indicators, the retrieval model outperformed the comparison method. This indicated that the proposed attention-based improved DRSN tourism image retrieval method had significant advantages in retrieval accuracy and efficiency. The method provides a new and effective solution for tourism image retrieval, which is of great significance for practical applications. However, the research still has shortcomings. The study evaluated retrieval on only a limited set of data. In the future, the method can be applied to other datasets, and its performance and robustness can be verified through more experiments.
REFERENCES
[1] C. C. Chiu, W. J. Wei, L. C. Lee, and J. C. Lu, ``Augmented reality system for tourism using image-based recognition,'' Microsystem Technologies, vol. 27, no. 4, pp. 1811-1826, 2021.

[2] S. Zulzilah, E. Prihantoro, and S. Masitoh, ``The image tourism destinations of Bandung in social media network,'' International Journal of Multicultural and Multireligious Understanding, vol. 6, no. 10, pp. 72-83, 2019.

[3] M. Hasanvand, M. Nooshyar, E. Moharamkhani, and A. Selyari, ``Machine learning methodology for identifying vehicles using image processing,'' Artificial Intelligence and Applications, vol. 1, no. 3, pp. 170-178, 2023.

[4] Y. Duan, J. Wang, H. Ma, and Y. Sun, ``Residual convolutional graph neural network with subgraph attention pooling,'' Tsinghua Science and Technology, vol. 27, no. 4, pp. 653-663, 2021.

[5] Z. Yang, J. Shang, Z. Zhang, Y. Zhang, and S. Liu, ``A new end-to-end image dehazing algorithm based on residual attention mechanism,'' Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University, vol. 39, no. 4, pp. 901-908, 2021.

[6] H. Han, L. Zhuo, J. Li, J. Zhang, and M. Wang, ``Blind image quality assessment with channel attention based deep residual network and extended LargeVis dimensionality reduction,'' Journal of Visual Communication and Image Representation, vol. 80, no. 10, 103296, 2021.

[7] X. Su, Q. Zheng, Q. Zheng, and W. Xu, ``Effects of environmental attractiveness and tourism image cognition of ecotourism on customer satisfaction,'' Journal of Environmental Protection and Ecology, vol. 21, no. 2, pp. 783-789, 2020.

[8] M. Maree, A. Rattrout, M. Altawil, and M. Belkhatir, ``Multi-modality search and recommendation on Palestinian cultural heritage based on the holy-land ontology and extrinsic semantic resources,'' Journal on Computing and Cultural Heritage (JOCCH), vol. 14, no. 3, pp. 1-23, 2021.

[9] P. Zheng, J. Li, J. Wang, H. Cheng, and Q. Wang, ``The coupling coordination of relationships between tourism destination image and product country image,'' International Journal of Tourism Research, vol. 23, no. 5, pp. 858-870, 2021.

[10] E. Ageeva and P. Foroudi, ``Tourists' destination image through regional tourism: From supply and demand sides perspectives,'' Journal of Business Research, vol. 101, pp. 334-348, 2019.

[11] D. Zhou, Y. Qian, Y. Ma, Y. Fan, J. Yang, and F. Tan, ``Low illumination image enhancement based on multi-scale CycleGAN with deep residual shrinkage,'' Journal of Intelligent & Fuzzy Systems, vol. 42, no. 3, pp. 2383-2395, 2022.

[12] J. Yan, M. Shi, X. Lv, Y. Zhang, and Y. Ma, ``An inversion method for coupled typical error sources based on remote sensing image,'' Journal of Imaging Science & Technology, vol. 66, no. 6, 060503, 2022.

[13] R. S. Thakur, R. N. Yadav, and L. Gupta, ``State-of-art analysis of image denoising methods using convolutional neural networks,'' IET Image Processing, vol. 13, no. 13, pp. 2367-2380, 2019.

[14] W. Xie, M. Cui, M. Liu, P. Wang, and B. Qiang, ``Deep hashing multi-label image retrieval with attention mechanism,'' International Journal of Robotics & Automation, vol. 37, no. 4, pp. 372-381, 2022.

[15] L. Shan, M. Yu, J. Xia, J. Xin, C. Deng, and L. Zhu, ``Overlapped spectral demodulation of fiber Bragg grating using convolutional time-domain audio separation network,'' Optical Engineering, vol. 62, no. 6, 066104, 2023.

[16] E. V. Diana and M. Sumathi, ``An intelligent deep learning architecture using multi-scale residual network model for image interpolation,'' Journal of Advances in Information Technology, vol. 14, no. 5, pp. 970-979, 2023.

[17] Q. Wang, J. Lai, Z. Yang, K. Xu, and L. Lei, ``Improving cross-dimensional weighting pooling with multi-scale feature fusion for image retrieval,'' Neurocomputing, vol. 363, no. 10, pp. 17-26, 2019.

[18] Y. Zhu, Y. Wang, H. Chen, Z. Zuo, and Q. Huang, ``Large-scale image retrieval with deep attentive global features,'' International Journal of Neural Systems, vol. 33, no. 3, pp. 13-30, 2023.

[19] Y. Li, Z. He, Z. Zhang, W. Zhang, P. Chatterjee, and D. Pamucar, ``A novel feature aggregation approach for image retrieval using local and global features,'' CMES-Computer Modeling in Engineering & Sciences, vol. 131, no. 1, pp. 239-262, 2022.

[20] Z. Wang, ``Video summarization generation with self-attention and random forest regression,'' Proc. of Second International Symposium on Computer Applications and Information Systems (ISCAIS 2023), SPIE, vol. 12721, no. 6, pp. 349-356, 2023.

Author
Renbi Zhao graduated from Guangdong Polytechnic Normal University in 2002 with a bachelor’s degree in tourism management and service education. She is currently Dean of the School of Management at Guangdong Nanfang Institute of Technology, with the title of Associate Professor. She is recognized as an Outstanding Teacher in Private Education in Guangdong Province, an Outstanding Young Science and Technology Pioneer in Jiangmen City, and an Outstanding Teacher in Jiangmen City. She has published over twenty papers in various national journals, with her research primarily focusing on tourism culture and tourism resource development.