IEIE SPC - IEIE Transactions on Smart Processing & Computing

Title	A 1D CNN-LSTM using Wav2vec 2.0 for Violent Scene Discrimination
Authors	Huiyong Bak; Sangmin Lee
DOI	https://doi.org/10.5573/IEIESPC.2022.11.2.92
Page	pp.92-96
ISSN	2287-5255
Keywords	Violent scene discrimination; Wav2vec 2.0; Audio signal processing
Abstract	This paper proposes an effective system for discriminating violent scenes in movies from audio signals alone. Automatic discrimination of violent scenes is one of the most crucial aspects of media filtering, which protects users from undesired media. Previous studies have performed violent scene discrimination using a mel spectrogram and 2D convolutional neural networks (CNNs); however, the mel spectrogram cannot extract mutual information from audio, and 2D CNNs are unsuitable for audio. As a result, these models do not perform well. The proposed system extracts audio features using Wav2vec 2.0, which can extract mutual information from audio. The extracted features are fed into a 1D CNN and a long short-term memory (LSTM) network, both of which are suitable for audio, and violent scenes are discriminated through fully connected and softmax layers. The proposed system is evaluated on the Violent Movie Scenes Dataset (VMD), where it discriminates violent scenes with an accuracy of 96.25%, outperforming previous studies.
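
The pipeline described in the abstract (Wav2vec 2.0 features, then a 1D CNN, then an LSTM, then fully connected and softmax layers) can be illustrated with a minimal NumPy forward pass. This is only a sketch under stated assumptions, not the authors' implementation: the feature matrix is random stand-in data rather than real Wav2vec 2.0 output, the weights are untrained, and the layer sizes (768-dimensional features, 49 frames, 16 convolution channels, kernel size 3, 8 hidden units, 2 classes) are hypothetical values chosen for illustration. The abstract does not state any of them.

```python
import numpy as np


def conv1d_relu(x, w, b):
    """Valid 1D convolution over the time axis, followed by ReLU.

    x: (T, C_in), w: (K, C_in, C_out), b: (C_out,) -> (T - K + 1, C_out)
    """
    k = w.shape[0]
    # Stack every length-K window of frames: shape (T - K + 1, K, C_in).
    windows = np.stack([x[t:t + k] for t in range(x.shape[0] - k + 1)])
    return np.maximum(np.einsum("tkc,kco->to", windows, w) + b, 0.0)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_last_hidden(x, w_x, w_h, b):
    """Run a single-layer LSTM over x and return the final hidden state.

    x: (T, D), w_x: (D, 4H), w_h: (H, 4H), b: (4H,) -> (H,)
    """
    hidden = w_h.shape[0]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x_t in x:
        gates = x_t @ w_x + h @ w_h + b
        i, f, g, o = np.split(gates, 4)  # input, forget, candidate, output
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h


def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()


def classify(features, params):
    """1D CNN -> LSTM -> fully connected -> softmax, as in the abstract."""
    y = conv1d_relu(features, params["conv_w"], params["conv_b"])
    h = lstm_last_hidden(y, params["lstm_wx"], params["lstm_wh"], params["lstm_b"])
    return softmax(h @ params["fc_w"] + params["fc_b"])


def init_params(rng, feat_dim=768, conv_ch=16, kernel=3, hidden=8, classes=2):
    """Random, untrained weights; all sizes are hypothetical."""
    scale = 0.1
    return {
        "conv_w": scale * rng.standard_normal((kernel, feat_dim, conv_ch)),
        "conv_b": np.zeros(conv_ch),
        "lstm_wx": scale * rng.standard_normal((conv_ch, 4 * hidden)),
        "lstm_wh": scale * rng.standard_normal((hidden, 4 * hidden)),
        "lstm_b": np.zeros(4 * hidden),
        "fc_w": scale * rng.standard_normal((hidden, classes)),
        "fc_b": np.zeros(classes),
    }


rng = np.random.default_rng(0)
# Stand-in for Wav2vec 2.0 output: 49 frames of 768-dimensional features (assumed).
features = rng.standard_normal((49, 768))
probs = classify(features, init_params(rng))  # [P(class 0), P(class 1)]
```

In a real system, `features` would come from a pretrained Wav2vec 2.0 encoder applied to the movie audio, and the weights would be learned on labeled violent and non-violent clips; here `probs` only demonstrates the shape of the data flow.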

Copyright © IEIE. All rights reserved.

No part of this publication may be reproduced or distributed in any form or by any means, or stored in a database or retrieval system, without the prior permission of the publisher (www.theieie.org).

ISSN : 2287-5255 (Online)