In the wave of digital transformation, how to scientifically evaluate and guide the
level of the DigE and its impact on the green transformation of the economic belt
has become an important issue in economic research. To address this challenge,
the study combines the technique for order preference by similarity to an ideal
solution (TOPSIS) with an NN to evaluate the relative superiority and inferiority of
various solutions for developing the economy [14].
2.1. Establishment of an Indicator System Based on CRITIC-TOPSIS
\textls[-10]{This study focuses on the DigE in the YREB. To reflect the research content
comprehensively, a large number of tertiary indicators related to the DigE are
collected. Before the empirical analysis, independence, redundancy, and universality
tests are needed to ensure that the initially constructed indicator system is rational
and accurately reflects the DigE situation of the YREB [15]. When the indicator system
involves multiple indicators,
a correlation coefficient matrix can be used to analyze the degree of correlation
between these indicators. Assuming the correlation coefficient matrix is $R^{{p}}
$, $p=1$, $2$, $\ldots$, $n$, as shown in Eq. (1).}
The redundancy of the indicator system is measured by the average correlation coefficient $RD$, given in Eq. (2).
In Eq. (2), the value of $RD$ lies between 0 and 1. The smaller the value of $RD$, the less
redundant information the indicators in the system share, indicating a lower correlation
between them. When $RD\le 0.5$, the correlation between indicators can be considered
relatively low. When $RD>0.5$, the indicators are highly correlated and may carry
excess information [16]. The sensitivity test of the indicator system is shown in Eq. (3).
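The redundancy check described above can be sketched in Python. Because Eq. (2) is not reproduced here, the sketch assumes that $RD$ is the mean absolute Pearson correlation over all distinct pairs of indicators; the function names `pearson` and `redundancy` and the sample data are hypothetical and serve only as illustration.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def redundancy(columns):
    """Mean |r| over all distinct indicator pairs (assumed form of RD)."""
    m = len(columns)
    pairs = [(i, j) for i in range(m) for j in range(i + 1, m)]
    return sum(abs(pearson(columns[i], columns[j])) for i, j in pairs) / len(pairs)

# Hypothetical data: three indicators, each observed on five evaluation objects.
indicators = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 1.0, 4.0, 3.0, 5.0],
    [5.0, 3.0, 4.0, 1.0, 2.0],
]
rd = redundancy(indicators)
label = "low redundancy" if rd <= 0.5 else "possible excess information"
print(f"RD = {rd:.3f} ({label})")
```

Under this assumed definition, a value at or below the 0.5 threshold is read as low redundancy, matching the rule stated in the text.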
Following each alteration to a parameter, the evaluation outcomes of the indicator
system must be recalculated. The final result is given in Eq. (4).
According to Eq. (4), when the absolute value of $SD$ is less than 5, the structure of the indicator system
can be considered reasonable. The calculated value is $SD=1.4$, whose absolute value is
less than 5, indicating that the system still provides relatively stable and reliable
evaluation results under different conditions. The calculated $RD$ of 0.29 indicates
that the correlation between indicators in the indicator system is low and there is
no excessive information redundancy. Based on the economic development data
of 9 provinces from 2017 to 2021, the data are preprocessed as shown
in Eq. (5).
The CRITIC method is employed to determine the initial weight of each indicator. The
original indicator data matrix is established first, as illustrated in Eq. (6).
In Eq. (6), $n$ is the number of evaluation objects, $m$ is the number of evaluation indicators,
and $x_{kij}$ is the indicator value. $k$, $i$, and $j$ index the year, evaluation
object, and evaluation indicator, respectively, with ($k=1$, $2$, $\ldots$, $y$),
($i=1$, $2$, $\ldots$, $n$), ($j=1$, $2$, $\ldots$, $m$). The evaluation indicators
are shown in Eq. (7).
The stronger the positive correlation between indicators, the more likely they are
to share a common influence or carry duplicate information about the evaluation
object, and the lower the conflict between them. The expressions for each quantity
are shown in Eq. (8).
In Eq. (8), $S_{j} $, $R_{j} $, $C_{j} $, and $W_{j} $ respectively represent the contrast
intensity of indicator $j$, its conflict with the other indicators, the amount of
information it contains, and its objective weight. The positive
ideal solution is given in Eq. (9).
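The CRITIC quantities $S_{j}$, $R_{j}$, $C_{j}$, and $W_{j}$ can be illustrated with a short Python sketch. It assumes the commonly used CRITIC form, in which $S_{j}$ is the sample standard deviation of indicator $j$, $R_{j}=\sum_{i}(1-r_{ij})$, $C_{j}=S_{j}R_{j}$, and $W_{j}=C_{j}/\sum_{j}C_{j}$; the source's Eq. (8) may differ in detail. The function names and the standardized sample data are hypothetical.

```python
import math

def column_std(col):
    """Sample standard deviation, used here as the contrast intensity S_j."""
    mean = sum(col) / len(col)
    return math.sqrt(sum((v - mean) ** 2 for v in col) / (len(col) - 1))

def pearson(a, b):
    """Pearson correlation coefficient r_ij between two indicator columns."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def critic_weights(columns):
    """Objective weights W_j from standardized indicator columns (assumed CRITIC form)."""
    m = len(columns)
    S = [column_std(c) for c in columns]
    # Conflict R_j: large when indicator j is weakly or negatively correlated with the others.
    R = [sum(1.0 - pearson(columns[j], columns[i]) for i in range(m)) for j in range(m)]
    C = [s * r for s, r in zip(S, R)]  # information content C_j
    total = sum(C)
    return [c / total for c in C]

# Hypothetical standardized data: three indicators on five evaluation objects.
columns = [
    [0.0, 0.25, 0.5, 0.75, 1.0],
    [1.0, 0.4, 0.6, 0.0, 0.2],
    [0.3, 1.0, 0.0, 0.6, 0.8],
]
weights = critic_weights(columns)
print([round(w, 3) for w in weights])
```

An indicator that is strongly positively correlated with the others contributes small $(1-r_{ij})$ terms, which lowers its conflict and therefore its weight, in line with the reasoning in the text.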
In Eq. (9), $J_{1} $ represents the positive indicators and $J_{2} $ the negative ones. After the
raw data are standardized, the values of all indicators are scaled to between
0 and 1. The ideal solution is the case in which every indicator attains the maximum
value of 1, that is, all indicators achieve the best performance.
The negative ideal solution is the case in which every indicator attains the minimum
value of 0, that is, all indicators show the worst performance [17].
In Eq. (10), $D_{i}^{+} $ and $D_{i}^{-} $ represent the distances from the evaluation object
to the positive and negative ideal solutions, $x^{+} $ and $x^{-} $, respectively. $W_{j} $ is the
weight of the $j$-th indicator. The relative closeness of each evaluation object is given in Eq.
(11).
In Eq. (11), $C_{i}^{*} \in [0$, $1]$. The closer the calculated closeness value is to 1, the
closer the evaluation object is to the positive ideal solution. Fig. 1 shows the structure diagram.
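The TOPSIS distance and closeness steps can be sketched as follows. The sketch assumes a weighted Euclidean distance of the form $D_{i}^{\pm}=\sqrt{\sum_{j}W_{j}(x_{ij}-x_{j}^{\pm})^{2}}$ and the usual closeness $C_{i}^{*}=D_{i}^{-}/(D_{i}^{+}+D_{i}^{-})$; the source's Eqs. (10) and (11) may place the weights differently. Following the text, the positive and negative ideal solutions are taken as 1 and 0 on every standardized indicator. The function name and data are hypothetical.

```python
import math

def topsis_closeness(matrix, weights):
    """Relative closeness of each row of a [0, 1]-standardized matrix.

    Assumes ideal solution x+ = 1 and negative ideal solution x- = 0 on every
    indicator, with a weighted Euclidean distance (an assumed form).
    """
    scores = []
    for row in matrix:
        d_pos = math.sqrt(sum(w * (x - 1.0) ** 2 for x, w in zip(row, weights)))
        d_neg = math.sqrt(sum(w * (x - 0.0) ** 2 for x, w in zip(row, weights)))
        scores.append(d_neg / (d_pos + d_neg))
    return scores

# Hypothetical standardized data: three evaluation objects, three indicators.
matrix = [
    [1.0, 1.0, 1.0],  # best on every indicator
    [0.5, 0.5, 0.5],  # halfway on every indicator
    [0.0, 0.0, 0.0],  # worst on every indicator
]
weights = [0.5, 0.3, 0.2]  # hypothetical weights, e.g. obtained from CRITIC
closeness = topsis_closeness(matrix, weights)
print(closeness)
```

Objects can then be ranked by closeness in descending order, so that values near 1 correspond to objects near the positive ideal solution.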
In Back Propagation Neural Networks (BPNN), the number of nodes in the input layer is
usually the same as the number of indicators in the DigE system. Determining the
number of nodes in the hidden layer is an important issue, and there is no fixed standard
or formula for it. Different node numbers need to be tried and the resulting model
performance compared to select an appropriate number of nodes
[18]. The formula is shown in Eq. (12).
In Eq. (12), $k$, $m$, and $n$ respectively represent the numbers of nodes in the hidden, input,
and output layers. Through trial and error, the final number of nodes in the hidden
layer was set to 13. Besides determining the number of nodes, the weights of the
evaluation indicators should also be calculated, as shown in Eq. (13).
In Eq. (13), $x=w_{jk} $, and the correlation index is shown in Eq. (14).
In Eq. (14), $y=r_{ij} $, and the absolute influence coefficient is given in Eq. (15).
In Eq. (15), $S$ represents the weight. $i$, $j$, and $k$ index the input unit, output unit,
and hidden unit of the NN, respectively, with ($i=1$, $2$, $\ldots$,
$m$), ($j=1$, $2$, $\ldots$, $n$), ($k=1$, $2$, $\ldots$, $P$).
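The way indicator weights can be derived from trained NN weights is illustrated below. Since the equations themselves are not reproduced here, the sketch assumes a formulation often used with BPNNs: a correlation significance coefficient $r_{ij}=\sum_{k}w_{ki}(1-e^{-x})/(1+e^{-x})$ with $x=w_{jk}$, a correlation index $R_{ij}=|(1-e^{-y})/(1+e^{-y})|$ with $y=r_{ij}$, and an absolute influence coefficient $S_{ij}=R_{ij}/\sum_{i}R_{ij}$. The function names, the layout of the weight arrays, and the numeric weights are all hypothetical.

```python
import math

def squash(v):
    """(1 - e^{-v}) / (1 + e^{-v}), which is identical to tanh(v / 2)."""
    return math.tanh(v / 2.0)

def indicator_weights(w_in_hidden, w_hidden_out):
    """Absolute influence coefficients S[i][j] (assumed formulation).

    w_in_hidden[k][i]:  weight from input unit i to hidden unit k.
    w_hidden_out[j][k]: weight from hidden unit k to output unit j.
    """
    p = len(w_in_hidden)       # number of hidden units
    m = len(w_in_hidden[0])    # number of input units (indicators)
    n = len(w_hidden_out)      # number of output units
    # Correlation significance coefficient r_ij.
    r = [[sum(w_in_hidden[k][i] * squash(w_hidden_out[j][k]) for k in range(p))
          for j in range(n)] for i in range(m)]
    # Correlation index R_ij.
    R = [[abs(squash(r[i][j])) for j in range(n)] for i in range(m)]
    # Normalize over the inputs so the weights for each output sum to 1.
    col_sums = [sum(R[i][j] for i in range(m)) for j in range(n)]
    return [[R[i][j] / col_sums[j] for j in range(n)] for i in range(m)]

# Hypothetical trained weights: 3 inputs, 2 hidden units, 1 output.
w_in_hidden = [[0.5, -0.2, 0.1],
               [0.3, 0.8, -0.4]]
w_hidden_out = [[0.7, -0.6]]
S = indicator_weights(w_in_hidden, w_hidden_out)
print([round(row[0], 3) for row in S])
```

In practice the two weight arrays would be read from the trained network, for example one with 13 hidden units as selected in the text.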
2.2. Construction of a GDES for Economic Level in the YR Based on NNs and TOPSIS
The economic level and green development evaluation system (GDES) for the YR area
is built on the NN and TOPSIS methods. In Subsection
2.1, the study collects data related to economic level and green development (GD),
including economic indicators and environmental factors. The collected data
are then preprocessed, including data cleaning, missing value processing, and standardization,
to ensure the consistency of the data. Next, TOPSIS is adopted to evaluate
the economic level and GD, and an NN is used to construct a model for the evaluation
system. The evaluation system flowchart is shown in Fig. 2.
Fig. 2. GDES of economic level along the YR.
Fig. 2 shows the flowchart of the economic level and GDES for the YR Basin, covering data
collection, preprocessing, NN training, model validation, output generation, integration
with TOPSIS methods, and interpretation and analysis of results. Data collection is
the foundation for building an evaluation system, followed by data cleaning, missing
value processing, and standardization to ensure quality. After preprocessing, the
dataset is divided into a training set and a validation set for training and evaluating
the NN. During training, the training set data are input so that the network learns the data
patterns, and the model performance is evaluated on the validation set. The output
of the NN reflects the evaluation score of economic level and GD, which is then integrated
with the TOPSIS method for comprehensive evaluation. The Gini coefficient
is given in Eq. (16).
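A Gini coefficient over regional levels can be computed as in the following sketch. It assumes the standard rank-based form for values sorted from low to high, $G=\frac{2\sum_{i=1}^{n} i\,y_{i}}{n^{2}\mu}-\frac{n+1}{n}$, where $n$ is the number of regions, $y_{i}$ the $i$-th smallest level, and $\mu$ the mean level; the source's Eq. (16) may be written in an equivalent or different form. The function name and sample values are hypothetical.

```python
def gini(levels):
    """Gini coefficient of non-negative levels (assumed rank-based form)."""
    y = sorted(levels)              # rank regions from low to high
    n = len(y)
    mu = sum(y) / n                 # mean level across all regions
    ranked_sum = sum(i * v for i, v in enumerate(y, start=1))
    return 2.0 * ranked_sum / (n * n * mu) - (n + 1) / n

# Hypothetical composite levels for nine regions.
levels = [0.21, 0.35, 0.28, 0.52, 0.44, 0.31, 0.66, 0.25, 0.39]
print(f"Gini = {gini(levels):.3f}")
```

With this form, identical levels give 0, and a case where a single region holds the entire total gives $(n-1)/n$.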
In Eq. (16), $n$ is the number of regions participating in the evaluation. $y_{i} $ is the
DigE and GD level of the $i$-th province or city in the YR Basin, with the regions
ranked from low to high. $\mu $ is the mean DigE and GD level in the YR Basin, used
to measure the average level of the entire basin [19,20]. In kernel density estimation, the kernel function generates a smooth curve at each sample point, and all these
smooth curves add up to form an estimated probability density function. The expression
is Eq. (17).
In Eq. (17), $N$ is the number of research subjects. $x_{i} $ denotes the independent and identically
distributed observations. $x$ is the average DigE of the 9 observed provinces, used
to measure the average level of the entire YREB. $H$ represents the window width,
a parameter in kernel density estimation that controls the width of the kernel
function and thus the smoothness of the estimate. $k(*)$ represents the kernel
function. This article adopts the Gaussian kernel function, which is often used for smoothing
in kernel density estimation [21,22]. The standard deviation ellipse is given in Eq. (18).
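Kernel density estimation with a Gaussian kernel can be sketched as follows. The sketch assumes the standard estimator $f(x)=\frac{1}{NH}\sum_{i=1}^{N}k\left(\frac{x_{i}-x}{H}\right)$ with $k(u)=\frac{1}{\sqrt{2\pi}}e^{-u^{2}/2}$; the sample values and the window width `h = 0.08` are hypothetical, since the source does not state which bandwidth was used.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, used as the kernel function k(.)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde(x, samples, h):
    """Kernel density estimate at point x with window width h."""
    return sum(gaussian_kernel((xi - x) / h) for xi in samples) / (len(samples) * h)

# Hypothetical observations and window width.
samples = [0.21, 0.35, 0.38, 0.52, 0.66]
h = 0.08

# Evaluate the estimated density on a coarse grid.
for x in (0.2, 0.4, 0.6):
    print(f"f({x:.1f}) = {kde(x, samples, h):.3f}")

# Riemann-sum check that the estimate integrates to approximately 1.
step = 0.001
area = sum(kde(-1.0 + t * step, samples, h) for t in range(3000)) * step
```

A larger window width gives a smoother, flatter curve, while a smaller one follows the individual observations more closely.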
The standard deviation ellipse (SDE) has two semi-axes, the semi-major axis and the
semi-minor axis. The semi-major axis indicates the main distribution direction of the geographical
features, and its length reflects the degree of concentration of the spatial movement
of the geographical features [23]. The semi-minor axis indicates the spatial distribution range of the geographical features,
that is, how far they deviate from the main distribution direction.
The expressions for the $X$-axis and $Y$-axis standard deviations of the SDE are shown
in Eq. (19).
In Eq. (19), $(x_{i} ,y_{i} )$ represents the coordinates of each decision-making unit in the spatial
area of the research object. $w_{i} $ is the weight of each decision unit and $i$ is its
index. $x$ and $y$ are the horizontal and vertical distances of the decision units
from the center point of the ellipse. $\theta $ is the angle obtained by rotating
clockwise from due north to the major axis of the SDE. $\sigma _{x} $
and $\sigma _{y} $ are the standard deviations along the $x$-axis and $y$-axis directions.
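The SDE parameters can be computed as in the sketch below. It assumes a commonly used weighted formulation: the center is the weighted mean of the coordinates, the deviations from the center are scaled by $w_{i}$, $\tan\theta=(A+B)/C$ with $A=\sum w_{i}^{2}x_{i}^{2}-\sum w_{i}^{2}y_{i}^{2}$, $B=\sqrt{A^{2}+4(\sum w_{i}^{2}x_{i}y_{i})^{2}}$, and $C=2\sum w_{i}^{2}x_{i}y_{i}$, and both standard deviations are normalized by $\sum w_{i}^{2}$. The source's Eqs. (18) and (19) may differ in detail, and the function name and coordinates are hypothetical.

```python
import math

def standard_deviation_ellipse(points, weights):
    """Return (center, theta, sigma_x, sigma_y) under the assumed weighted form.

    theta is in radians, measured clockwise from due north (the +y direction).
    """
    total_w = sum(weights)
    cx = sum(w * x for (x, _), w in zip(points, weights)) / total_w
    cy = sum(w * y for (_, y), w in zip(points, weights)) / total_w
    # Weighted deviations from the center.
    dx = [w * (x - cx) for (x, _), w in zip(points, weights)]
    dy = [w * (y - cy) for (_, y), w in zip(points, weights)]
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    sxy = sum(a * b for a, b in zip(dx, dy))
    A = sxx - syy
    B = math.sqrt(A * A + 4.0 * sxy * sxy)
    # atan2 avoids division by zero when the cross term vanishes.
    theta = math.atan2(A + B, 2.0 * sxy)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    denom = sum(w * w for w in weights)
    sigma_x = math.sqrt(sum((a * cos_t - b * sin_t) ** 2 for a, b in zip(dx, dy)) / denom)
    sigma_y = math.sqrt(sum((a * sin_t + b * cos_t) ** 2 for a, b in zip(dx, dy)) / denom)
    return (cx, cy), theta, sigma_x, sigma_y

# Hypothetical units lying on a south-west to north-east diagonal, equal weights.
points = [(-2.0, -2.0), (-1.0, -1.0), (1.0, 1.0), (2.0, 2.0)]
weights = [1.0, 1.0, 1.0, 1.0]
center, theta, sigma_x, sigma_y = standard_deviation_ellipse(points, weights)
print(center, round(math.degrees(theta), 1), round(sigma_x, 3), round(sigma_y, 3))
```

In this convention, `sigma_y` is the standard deviation along the direction given by `theta` and `sigma_x` is the standard deviation perpendicular to it.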