Signals collected from mechanical systems are difficult to label accurately, so
extracting fault information directly from them is challenging. In fault-mechanism
research, several models exist for generating simulated fault signals. A generative
adversarial network (GAN), composed of a generator and a recognizer (discriminator),
can produce highly realistic simulated data through the adversarial game between its
two parts. Therefore, starting from a physical model, the study builds the generator
and recognizer of the GAN from convolutional neural networks, and feeds annotated
simulation signals into the generator to obtain generated signals that match the
distribution of the real system signals. The study then uses these generated signals
to train classifiers and conduct fault diagnosis (FD) experiments. The method aims to
achieve efficient FD using low-cost unlabeled data.
3.1 Convolutional Network Architecture Based on GAN
Generative adversarial networks, as a method of fitting distributions, have been widely
used in image processing. Convolutional neural networks (CNNs) excel at extracting
features from images, while GANs excel at generating realistic images. Using CNNs as
the building blocks of a GAN helps it capture and reproduce the complex structure of
image data, and thus generate higher-quality images. Such a network can produce generated
signals that fit the true signal distribution, and it can also be used for signal migration
between different distribution domains [13]. As an unsupervised method, a GAN can learn
a data distribution and sample from it. In the convolutional network structure of a GAN,
the two main parts are the generator and the recognizer [14]. The purpose of the generator
is to generate realistic data samples from random noise. In convolution-based GAN
structures, the generator typically consists of a series of convolutional layers, which
may be conventional convolutional layers or transposed convolutional layers (also called
deconvolution layers); the latter are often used to generate high-resolution images from
random noise. The convolutional layers in the recognizer gradually extract low-level and
high-level features from the image, and one or several fully connected layers then output
a scalar value representing the probability that the input sample is real data. The
specific structure is shown in Fig. 1.
In Fig. 1, the generator aims to generate data whose distribution is similar to that of
the training data. The two networks compete with each other during training, which results
in a generator that produces high-quality data samples. In the mechanical fault diagnosis
model, the main function of the GAN is to generate mechanical fault data, providing a rich
and diverse fault data set for training and validating the fault diagnosis model.
In formula (1), $x$ represents the collected signal, $P_{GD}$ is the distribution of
generated data, $P_{RD}$ is the distribution of real data, $G$ represents the generator
function, and $D$ represents the recognizer function. Solving formula (1) shows that, at
its optimal value, the recognizer function takes the form given in formula (2).
Substituting formula (2) into formula (1) rewrites formula (1) as formula (3).
In formula (3), $\zeta$ represents relative entropy. Minimizing formula (3) gives a minimum
value of $-\log 4$, which is reached when the generated and real distributions coincide.
Therefore, a well-trained generator can achieve the goal of generating data with the same
distribution as the real data.
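Formulas (1) to (3) are not reproduced in this text. As a hedged reconstruction, assuming they follow the standard GAN formulation and reusing the symbols defined above ($P_{RD}$, $P_{GD}$, $G$, $D$, $\zeta$), they would take the following form; the authors' exact notation may differ.

```latex
% Assumed form of formula (1): the GAN value function
\min_{G}\max_{D} V(D,G) =
  \mathbb{E}_{x \sim P_{RD}}\left[\log D(x)\right]
  + \mathbb{E}_{x \sim P_{GD}}\left[\log\left(1 - D(x)\right)\right]

% Assumed form of formula (2): the optimal recognizer for a fixed generator
D^{*}(x) = \frac{P_{RD}(x)}{P_{RD}(x) + P_{GD}(x)}

% Assumed form of formula (3): the value function at D = D^{*}
V(D^{*},G) = -\log 4
  + \zeta\left(P_{RD} \,\middle\|\, \frac{P_{RD} + P_{GD}}{2}\right)
  + \zeta\left(P_{GD} \,\middle\|\, \frac{P_{RD} + P_{GD}}{2}\right)
```

Both relative entropy terms are non-negative and vanish only when $P_{GD} = P_{RD}$, so under this assumption the optimum is $-\log 4$ and it is attained exactly when the generator reproduces the real distribution.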
However, a GAN still carries a risk of mode collapse, so the study adopts the
Boundary Equilibrium Generative Adversarial Network (BEGAN). This network converges
quickly and stably, and its structure is shown in Fig. 2.
Fig. 2. Network structure diagram of BEGAN.
In Fig. 2, the network has a generator that maps latent variables to two-dimensional
images, and a recognizer that maps two-dimensional images to a latent encoding. BEGAN
uses the Wasserstein distance (W-distance) to measure the distance between the source
domain (SDO) and the generated domain, as shown in formula (4) [15].
In formula (4), $W$ represents the W-distance, $\mu_{1}$ represents the measure of the
SDO, $\mu_{2}$ represents the measure of the generated domain, $\inf$ denotes the infimum,
and $E$ denotes the expected value of a random variable. The W-distance has a lower bound,
called the Lower Bound of Wasserstein Distance (LWD), which is expressed as formula (5).
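The relation between the W-distance and its lower bound can be checked numerically on one-dimensional samples. The sketch below is illustrative only: it assumes the LWD is the absolute difference of the two means, as obtained from Jensen's inequality in the original BEGAN derivation, and it uses the fact that for equal-size one-dimensional samples the optimal coupling pairs the sorted values. The function names and sample values are hypothetical.

```python
from statistics import mean


def w1_empirical(a, b):
    """Empirical 1-D Wasserstein-1 distance for two equal-size samples."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal sample sizes")
    # In one dimension the optimal coupling matches the sorted samples.
    return mean(abs(x - y) for x, y in zip(sorted(a), sorted(b)))


def mean_gap(a, b):
    """Assumed lower bound |E[x1] - E[x2]| of the W-distance."""
    return abs(mean(a) - mean(b))


source = [0.0, 1.0, 2.0, 3.0]      # hypothetical SDO samples
generated = [0.5, 0.5, 2.5, 4.5]   # hypothetical generated samples

w1 = w1_empirical(source, generated)
lwd = mean_gap(source, generated)
print(w1, lwd)  # the W-distance is never smaller than the bound
```

For these samples the distance is 0.75 and the bound is 0.5, so the bound holds but is not tight.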
Formula (5) gives this bound, but BEGAN does not optimize the LWD directly. An autoencoder
that is well trained on the SDO should correctly encode and decode data sampled from the
SDO data distribution. Therefore, BEGAN uses the distribution of the encoding loss to
measure the difference between the signal distributions of different domains, and the
loss functions of the generator and recognizer are shown in formula (6).
In formula (6), $\theta_{D}$ represents the parameters of the recognizer, $\theta_{G}$
represents the parameters of the generator, and $L$ represents the loss function. The
generator's optimization goal is to reduce the encoding loss of the generated signal in
the recognizer, while the recognizer's goal is to balance the encoding loss of the real
signal against that of the generated signal. BEGAN introduces a hyperparameter to balance
the generation quality and the diversity of the model. The expression of the
hyperparameter is shown in formula (7).
When the value of formula (7) is much less than 1, the model focuses on encoding the
collected signal. Based on the above, the goal of BEGAN can be obtained. The GAN structure
constructed above can generate fault signals close to the true values in fault diagnosis,
which helps expand the training set. In this way, GANs also improve the diversity of the
training data. The CNN structure is well suited to extracting signal features with time
and frequency characteristics, so the combination of GAN and CNN can address problems
such as the scarcity of annotated data, the diversity of failure modes, and the need to
simulate the distribution of data under real operating conditions.
3.2 FD Based on the BEGAN Convolutional Network and TL
The study applies a BEGAN-based transfer learning (TL) algorithm to FD, using annotated
simulation signals generated by dynamic models and real signals collected on test benches
to jointly train the GAN. The process is shown in Fig. 3.
Fig. 3. TL model process based on BEGAN.
Fig. 4. Signal migration problem.
In Fig. 3, the labeled signal generated by the dynamic model is fed into the generator to
obtain the generated signal, which is then fed into the recognizer for training together
with the unlabeled signal actually collected. The recognizer is trained to determine
whether an input signal is a generated signal or a real collected signal. The recognizer
is in turn used to train the generator, so that the signals produced by the generator
follow the same distribution as the real sampled signals, which adds real features to the
simulated signal [16]. Finally, the classifier is trained on annotated generated signals
that have the same distribution as the real signal, so that it can classify the real signal.
Although BEGAN is easy to train, it cannot guarantee that the input and output signals
belong to the same category. The study therefore analyzes this transfer problem. The
issues that BEGAN may encounter during migration are shown in Fig. 4.
In Fig. 4, the surface represents the probability distribution of the high-level features,
and the corresponding position distributions are obtained by projecting onto different
projection surfaces. Different feature dimension reduction methods yield similar
distributions, but the correspondences between them are completely opposite. This situation
is referred to as a mapping error. To address it, the study considers a boundary
equilibrium network with a constraint between the generated signal and the target domain
signal. Adding pixel-level regularization terms to constrain the changes between $\mu_{1}$
and $\mu_{2}$ would reduce the domain adaptation ability of BEGAN. To achieve better domain
adaptation and reduce mapping errors at the same time, the constraint is instead placed on
the average value, as shown in formula (8).
In formula (8), $N$ represents the number of signals. According to formula (8), the overall goal of the generator is shown in formula (9).
By using the balance mechanism of BEGAN and proportional-integral control to maintain
the weights of the two losses, the overall loss function is obtained. Introducing a
denoising autoencoder into the BEGAN structure yields the Denoising Autoencoder Boundary
Equilibrium Generative Adversarial Network (DABEGAN), whose structure is shown in Fig. 5.
Fig. 5. Structure of generation and recognition autoencoder.
In Fig. 5, the generator and recognizer of DABEGAN each include 3 convolution layers and
3 deconvolution layers. Each Conv2D convolutional layer has a set number of convolution
kernels, with a kernel size of $3 \times 3$ and a stride of $2 \times 2$. After each
convolutional layer, a batch normalization layer standardizes the output features, and
the Leaky ReLU function is chosen as the activation function, as shown in formula (10).
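The effect of the $3 \times 3$ kernels and $2 \times 2$ stride on feature-map size, and the shape of the Leaky ReLU activation, can be illustrated with a short sketch. The padding of 1, the output padding of 1, the input size of 64, and the slope value 0.2 are assumptions chosen for illustration; the text does not specify them. The size formulas are the usual ones for convolution and transposed convolution.

```python
def conv_out(size, kernel=3, stride=2, padding=1):
    """Output length of a strided convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def deconv_out(size, kernel=3, stride=2, padding=1, output_padding=1):
    """Output length of a transposed convolution along one axis."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


def leaky_relu(x, psi=0.2):
    """Leaky ReLU: identity for positive inputs, slope psi otherwise."""
    return x if x > 0 else psi * x


size = 64                 # hypothetical input height/width
trace = [size]
for _ in range(3):        # three convolution layers halve the size each time
    size = conv_out(size)
    trace.append(size)
for _ in range(3):        # three deconvolution layers restore it
    size = deconv_out(size)
    trace.append(size)

print(trace)
print(leaky_relu(3.0), leaky_relu(-2.0))
```

Under these assumptions the encoder-decoder returns a map of the same size as its input, which is what an autoencoder-style generator or recognizer requires.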
In formula (10), $\psi$ represents a small positive constant. Because the proposed DABEGAN
has strong domain adaptation ability, it can convert the SDO signal into a generated signal
whose distribution is similar to that of the target domain signal. After the structure and
optimization of DABEGAN and the classifier are determined through the above steps, the
study establishes a bearing vibration dynamics model and analyzes its health status.
For the bearing system, the shaft rotation speed, the fault frequency of the outer raceway,
and the shaft rotation frequency are denoted as $\omega_{s}$, $\omega_{bpo}$, and
$\omega_{c}$, respectively, all in rad/s. Their calculation is shown in formula (11).
In formula (11), $\omega_{bpi}$ represents the fault frequency of the inner raceway, in rad/s,
and $D$ represents the bearing pitch diameter. Based on formula (11), the contact deformation
of the rolling element under external forces is given by formula (12).
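Formula (11) is not reproduced in this text. As an illustration only, the sketch below uses the standard kinematic relations for a rolling bearing with a stationary outer ring, which may differ from the authors' exact form. It assumes that $\omega_{c}$ corresponds to the cage rotation frequency, as in the usual notation. The function name, the ball diameter `d_ball`, the ball count `n_balls`, the contact angle, and the numerical values are hypothetical.

```python
import math


def bearing_frequencies(omega_s, n_balls, d_ball, d_pitch, contact_angle=0.0):
    """Characteristic angular frequencies (rad/s), outer ring stationary."""
    ratio = (d_ball / d_pitch) * math.cos(contact_angle)
    omega_c = 0.5 * omega_s * (1.0 - ratio)              # cage frequency
    omega_bpo = 0.5 * n_balls * omega_s * (1.0 - ratio)  # outer raceway fault
    omega_bpi = 0.5 * n_balls * omega_s * (1.0 + ratio)  # inner raceway fault
    return omega_c, omega_bpo, omega_bpi


# Hypothetical bearing: 9 balls, 7.94 mm ball diameter, 39.04 mm pitch diameter,
# shaft turning at 30 revolutions per second.
omega_s = 2.0 * math.pi * 30.0
omega_c, omega_bpo, omega_bpi = bearing_frequencies(omega_s, 9, 7.94e-3, 39.04e-3)
print(omega_c, omega_bpo, omega_bpi)
```

Two consistency checks follow from these relations: the outer and inner raceway fault frequencies sum to the ball count times the shaft frequency, and the outer raceway fault frequency equals the ball count times the cage frequency.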
In formula (12), $x_{i}$ is the lateral displacement of the inner ring and $x_{0}$ is that
of the outer ring; $y_{i}$ is the longitudinal displacement of the inner ring and $y_{0}$
is that of the outer ring; $\delta$ is the contact deformation; and $\varepsilon$ is the
radial clearance inside the bearing, all in millimeters. $\theta$ is the rotation angle of
the rolling element, in rad. According to Hertz contact mechanics, the total contact force
between the inner and outer rings can be calculated using formula (13) [17,18].
In formula (13), $f$ represents the non-equilibrium force, in N. $K$ represents the Hertz
contact elasticity coefficient, in N/m$^{3/2}$. $\xi$ represents the load zone coefficient
of the rolling element, and acts as a flag that indicates whether the rolling element has
entered the fault zone. When the rolling element rolls into the fault zone, the radial
clearance increases suddenly, and the radial deformation in this direction then decreases
rapidly. According to the Hertz formula, the contact force depends directly on the radial
deformation. Therefore, the reduction in radial deformation lowers the Hertz force and
changes the acceleration. The expression of the radial deformation is shown in formula (14).
In formula (14), $\vartheta$ indicates whether the rolling element has reached the fault
groove, and $\Delta$ represents the additional gap that appears when the rolling element
reaches the fault groove, in millimeters. The situation when the rolling element enters
the fault is shown in Fig. 6.
Fig. 6. Changes in rolling element entering fault zone.
Once the additional gap between the rolling element and the fault groove is determined,
the Hertz force can be obtained from the radial deformation in formula (14). Solving the
model then yields a signal spectrum that can be used to analyze the bearing state.
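The drop in Hertz force when a rolling element enters the fault groove can be illustrated with a small sketch. Formulas (12) to (14) are not reproduced in this text, so the code assumes the common form in which the deformation is the relative ring displacement projected onto the element direction, minus the clearance and, inside the fault, minus the additional gap, and in which the force is $K\delta^{3/2}$ summed over compressed elements. Treating the load zone flag as "deformation greater than zero", working in metres, and all names and numerical values are assumptions.

```python
import math


def contact_deformation(dx, dy, theta, clearance, in_fault=False, fault_gap=0.0):
    """Radial deformation of one rolling element (lengths in metres).

    dx, dy are the inner-ring displacements relative to the outer ring.
    """
    extra = fault_gap if in_fault else 0.0
    return dx * math.cos(theta) + dy * math.sin(theta) - clearance - extra


def hertz_force(deformations, thetas, stiffness):
    """Total Hertz contact force components (N) over all rolling elements."""
    fx = fy = 0.0
    for delta, theta in zip(deformations, thetas):
        if delta > 0.0:  # load zone flag: only compressed elements carry load
            f = stiffness * delta ** 1.5
            fx += f * math.cos(theta)
            fy += f * math.sin(theta)
    return fx, fy


n_elements = 9
thetas = [2.0 * math.pi * j / n_elements for j in range(n_elements)]
dx, dy = 20e-6, 0.0                                        # hypothetical displacement
clearance, fault_gap, stiffness = 2e-6, 5e-6, 1.0e10       # hypothetical parameters

healthy = [contact_deformation(dx, dy, t, clearance) for t in thetas]
faulty = [
    contact_deformation(dx, dy, t, clearance, in_fault=(j == 0), fault_gap=fault_gap)
    for j, t in enumerate(thetas)
]
fx_healthy, fy_healthy = hertz_force(healthy, thetas, stiffness)
fx_faulty, fy_faulty = hertz_force(faulty, thetas, stiffness)
print(fx_healthy, fx_faulty)
```

With the element at angle zero inside the fault, its deformation falls from 18 to 13 micrometres and the total force along the displacement direction decreases, which is the mechanism that produces the impulsive change in acceleration.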