IEIESPC(IEIE Transactions on Smart Processing and Computing)

IEIESPC Vol. 10, No. 2, p.176-181

ISSN (print) :

2287-5255

Received : 18 September 2020, Revised : 07 November 2020, Accepted : 13 February 2021

DOI :

https://doi.org/10.5573/IEIESPC.2021.10.2.176

Regular Paper

* Extended from a Conference: Preliminary results of this paper were presented at the Summer Annual Conference of IEIE Summer 2020. This present paper has been accepted by the editorial board through the regular reviewing process that confirms the original contribution.

Improvement of Voice Quality using a Digital Filter to Compensate for the Spatial Characteristics of MEMS Microphone Chambers

Hyun-Kab Kim¹, Gyu-Sik Kim², Jongseol Lee³

(Department of Electrical and Computer Engineering, University of Seoul / Seoul, Korea hyunkab@keti.re.kr)
(Department of Electrical and Computer Engineering, University of Seoul / Seoul, Korea gskim318@uos.ac.kr)
(Information & Media Research Center, Korea Electronics Technology Institute / Seoul, Korea leejs@keti.re.kr)

^* Corresponding Author: Gyu-Sik Kim

License :

This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.(www.theieie.org).

Abstract

Micro-electromechanical system (MEMS) microphones are sound sensors that are widely used in modern mobile and Internet-of-Things devices. Their market share is rapidly increasing due to their small size, excellent performance characteristics, and low power consumption. However, studies have focused on improving module performance without any spatial considerations. Most voice recognition devices have a small internal chamber in which a sensor is installed. A chamber with an open hole serves as a resonator, which alters the MEMS microphone’s characteristics. The aim of this study is to improve voice quality by using a digital filter to compensate for the effect of the chamber. This will allow the identification of frequency-response compensation characteristics and the acquisition of base data that can be used for product design.

Keywords

Digital filter, Digital signal processing, Installation space, MEMS microphone, Voice quality

1. Introduction

In recent years, small, thin electronic devices that are capable of sensing sound have become widespread. Accordingly, micro-electromechanical system (MEMS) microphones have become mainstream and have largely replaced electret condenser microphones (ECMs). MEMS microphones are manufactured via micro processes using semiconductor process technology. This has enabled the mass production of small, lightweight, and highly sensitive components. These components exhibit excellent performance with low power consumption despite their small physical size, so they are well suited for use in battery-powered mobile electronic devices. Furthermore, they can be used to improve voice recognition and sound processing as their small size means that multiple components can be installed in a single device.

Studies on MEMS microphones have mainly focused on improving the component’s characteristics and have not considered the actual installation space. Therefore, these studies do not reflect the component’s performance when it is installed in a device. This results in problems such as significant performance degradation and the need for re-optimization processes, which increase manufacturing difficulty. Therefore, studies are required to resolve these problems.

The aim of this study is to improve the voice quality of a MEMS microphone unit by using a filter to adapt to changes in the frequency response of the unit. The changes in the frequency response were calculated based on the characteristics of the installation space, and chambers were designed and fabricated to simulate different installation spaces. A MEMS microphone unit was installed in each chamber to make sample sets. The sample sets were validated using an estimated resonant frequency, which was calculated based on the resonance characteristics of the spaces. Then, the frequency-response characteristics were measured experimentally, and a digital filter was developed for the digital signal processing (DSP) range.

This study is significant because we estimated the change in input sensitivity that occurs when the sound input function is designed solely from the characteristic figures of a stand-alone unit. In addition, we analyzed the causes of performance degradation in artificial intelligence (AI)-based voice and sound effect systems. The latter could be particularly helpful in resolving performance degradation problems in hybrid devices that use multiple MEMS microphones. These hybrid devices are affected by various performance variables, so the results of this study could be used to separate out the physical characteristic variables and resolve performance degradation problems more easily.

2. Background

MEMS microphones are mechanical sensor systems that include a thin pressure-sensitive vibration film that is deposited on a silicon wafer using a micro-semiconductor process ^[1]. These sensors can be divided into two main types: capacitance and piezoelectric element types. The capacitance element type was used in this study because it is the more widely used of the two.

Fig. 1 shows a diagram of the simplified structure of a capacitive MEMS microphone. The location of the input port and the sensor design may vary between manufacturers, but the overall structure does not change significantly. When sound enters the sensor through the input port, the thin membrane behind the back plate vibrates. This causes the capacitance in the electrodes connected to the application-specific integrated circuit (ASIC) to change. The ASIC outputs the received signal as an analog audio signal by amplifying it. This analog signal may be converted to a digital signal using pulse density modulation (PDM).

The vibration systems used by most microphones are being replaced by microsensors produced using semiconductor processes. MEMS microphones have a huge advantage in terms of size and efficiency. Their overall performance has improved since they were first developed in the 1980s, and now, MEMS microphones have an excellent sound signature equivalent to other types. In addition, they require a low drive voltage of approximately 3 V, which is superior to units that require voltages of 12–48 V. Furthermore, MEMS microphones feature high sensitivity with low power consumption, which makes them suitable for battery-powered devices ^[2].

However, devices that use microphones are becoming smaller, so their internal space is limited. This requires numerous changes in the installation space and increases manufacturers’ optimization concerns. Consequently, there is greater technical demand for studies such as the one presented here.

Fig. 1. Structure of a capacitive MEMS microphone. PCB: printed circuit board. ASIC: application-specific integrated circuit.

3. Design and Experiment

3.1 Chamber’s Spatial Characteristics

To simulate MEMS microphone installation spaces, we used 3D modeling to design experimental chambers with equal volume and different sizes of holes. Fig. 2 shows a 3D model of the experimental chambers, and Fig. 3 shows samples produced using a 3D printer. The chambers were designed to be thin and small to reflect the dimensions of smartphones and laptops, where MEMS microphones are commonly installed. The chambers measured $20 \times 20 \times 3.5\ \mathrm{mm}^{3}$ when a bottom panel mounted with a MEMS microphone was adjoined. The chambers were punched with holes 1, 1.5, and 2 mm in diameter.

Fig. 4 shows a cross-sectional diagram of the chambers with a MEMS microphone installed, which reflects the structure of chambers used for practical applications. The structure is akin to a Helmholtz resonator with a narrow hole and a uniform internal volume. Therefore, the effectiveness of the design can be verified by comparing the experimental results to the maximum resonant frequency calculated using the equations for a Helmholtz resonator ^[3].

For the calculation, we defined the internal volume as $V = V_{1} - V_{2}$, where $A$ is the cross-sectional area of the input hole in Fig. 4, and $l$ is the length of the hole. The pressure change outside the hole, $P_{o}(t)$, is caused by the mass flow rate $Q$ (kg/s) of air drawn into volume $V$. In addition, $P_{i}(t)$ is the pressure change inside volume $V$. When the hole resonates, the pressure change in the input part is very subtle, while the air inflow and outflow are maximized. Therefore, the frequency that minimizes the pressure change at the hole entrance, $P_{o}(t)$, is the maximum resonance point. This frequency is given by:

(1)

$ f=\frac{c}{2\pi }\sqrt{\frac{A}{\left(V_{1}-V_{2}\right)l}} $

Here, $c$ is the velocity of sound in the measured space. To reflect the experimental environment, $c$ was calculated using:

(2)

$ c=\sqrt{\frac{\gamma RT}{M}}\approx 331.5+\left(0.6\times \theta \right)\,\,\mathrm{m}/\mathrm{s} $

and

(3)

$ c=331.5+\left(0.6\times 22\right)\,\,\mathrm{m}/\mathrm{s}=344.7\,\,\mathrm{m}/\mathrm{s} $

The velocity of sound in air is expressed in Eq. (2), where $\gamma$ is the adiabatic index, $R$ is the gas constant, $T$ is the absolute temperature, $M$ is the molar mass of air, and $\theta$ is the temperature in degrees Celsius. It can be assumed that all of these terms except for the temperature are constant under the experimental conditions. Hence, entering the temperature in degrees Celsius gives the velocity of sound at the target temperature. The temperature was set to $22^{\circ}$C, as shown in Eq. (3). The resonant frequencies for each diameter were obtained by substituting the calculated sound velocity into Eq. (1) along with the other values shown in Table 1. The results were rounded to one decimal place.
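The calculation in Eqs. (1)–(3) can be sketched in a few lines of Python. This is only an illustration: the hole length `l` and the use of the outer chamber dimensions for the cavity volume are assumptions made here (the text does not list $V_1$, $V_2$, or $l$), so the printed frequencies are not expected to reproduce Table 1. What the sketch does show is that the linear approximation gives 344.7 m/s at 22 °C and that, for a fixed volume and hole length, the resonant frequency scales linearly with hole diameter, which is the 1 : 1.5 : 2 pattern seen in Table 1.

```python
import math


def speed_of_sound(theta_celsius):
    """Linear approximation from Eq. (2): c ~ 331.5 + 0.6 * theta (m/s)."""
    return 331.5 + 0.6 * theta_celsius


def helmholtz_frequency(c, hole_diameter, cavity_volume, hole_length):
    """Eq. (1): f = c / (2*pi) * sqrt(A / (V * l)), all inputs in SI units."""
    area = math.pi * (hole_diameter / 2.0) ** 2  # cross-section A of the hole
    return c / (2.0 * math.pi) * math.sqrt(area / (cavity_volume * hole_length))


c = speed_of_sound(22.0)  # temperature used in the experiment

# ASSUMED geometry, for illustration only (not taken from the paper):
V = 20e-3 * 20e-3 * 3.5e-3  # outer chamber dimensions used in place of V1 - V2
l = 1.0e-3                  # hypothetical hole length in metres

freqs = {d: helmholtz_frequency(c, d * 1e-3, V, l) for d in (1.0, 1.5, 2.0)}
for d, f in freqs.items():
    print(f"hole diameter {d} mm -> estimated resonance {f:.1f} Hz")
```

Because $A \propto d^{2}$ and the frequency depends on $\sqrt{A}$, doubling the hole diameter doubles the estimated resonance regardless of the assumed volume and hole length.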

Fig. 2. 3D models of the experimental chambers.

Fig. 3. Photograph of the experimental chambers produced using a 3D printer.

Fig. 4. Cross section of a chamber with a MEMS microphone installed.

Table 1. Resonant frequency for each hole diameter.

Diameter (mm)	Temperature (°C)	Resonant Frequency (Hz)
1	22	824.4
1.5	22	1236.6
2	22	1648.8

3.2 Frequency Response Experiment

A MEMS microphone was installed in each of the experimental chambers, and the frequency response was recorded. The measurements were conducted in a fully anechoic chamber conforming to ISO 3745. An air-handling unit was used to maintain the target temperature of $22^{\circ}$C. The microphone measurement method was used with a standard reference microphone in accordance with IEC 60268-4 ^[4,5].

Fig. 5 shows a block diagram of the equipment and the experimental facility. The equipment included an Audio Precision APx555 signal analyzer, B&K 4191 reference microphone, Focal CHORUS-V 806V reference speaker, B&K 2690 conditioning amplifier, Crown XLi 2500 power amplifier, and Agilent E3632A DC power supply. The MEMS microphone used for this experiment was a bottom-port-type capacitive analog microphone unit, as shown in Fig. 6. The drive voltage was 3 V. The microphone was installed in the chambers as shown in Fig. 4. It was connected to the input/output and voltage load lines and covered with an airtight seal using rubber glue to prevent gaps.

The frequency responses for the chambers with holes of different diameters are illustrated in Fig. 7. The solid, thick dotted, and thin dotted lines represent the chambers with holes that were 1, 1.5, and 2 mm in diameter, respectively. The resonant peaks of each curve lie at frequencies comparable to those calculated in Table 1.

The effectiveness of the sample sets can be verified using the difference between the calculated and measured frequencies. The difference will be greater if the chambers are improperly designed or the MEMS microphone is malfunctioning. Based on the measured frequency response characteristics, we produced a filter with the inverse characteristic to compensate for the change in frequency response. The compensation bandwidth was set to the primary bandwidth of the voice field, which ranges from 100 Hz to 4 kHz, with a maximum gain of 20 dB considering the actual limit of the DSP gain.
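As a rough sketch of this step, the compensation gain at each frequency can be taken as the negative of the measured deviation from a flat reference, restricted to the 100 Hz to 4 kHz band and limited to 20 dB. The function name, the sample deviations, and the reading of the 20 dB limit as a symmetric bound on boost and cut are assumptions made for illustration; they are not the authors' implementation or data.

```python
def compensation_gains_db(freqs_hz, response_db, reference_db=0.0,
                          band=(100.0, 4000.0), max_gain_db=20.0):
    """Return inverse-characteristic gains (dB) for a measured response.

    Inside `band` the gain is the inverse of the deviation from
    `reference_db`, clipped to +/- `max_gain_db`. Outside the band the
    response is left untouched (0 dB).
    """
    gains = []
    for f, r in zip(freqs_hz, response_db):
        if band[0] <= f <= band[1]:
            g = reference_db - r                        # inverse of the deviation
            g = max(-max_gain_db, min(max_gain_db, g))  # respect the DSP gain limit
        else:
            g = 0.0
        gains.append(g)
    return gains


# Hypothetical deviations (dB) from a flat response, NOT measured data.
freqs_hz = [50.0, 100.0, 800.0, 1200.0, 4000.0, 8000.0]
response_db = [3.0, 1.0, 12.0, 25.0, -4.0, -10.0]

gains = compensation_gains_db(freqs_hz, response_db)
for f, g in zip(freqs_hz, gains):
    print(f"{f:7.0f} Hz: {g:+.1f} dB")
```

In this sketch the 25 dB peak at 1200 Hz is only cut by 20 dB, which illustrates why the gain limit of the DSP bounds how much of a strong resonance can be corrected.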

A digital filter was applied to the signal. A digital equalizer was used to adjust the frequency response that had been altered by the spatial characteristics. The core of the implementation is to derive, from the measurement results, detailed digital equalization parameters that suit the voice bandwidth and to apply them to the input gain of the DSP.

Fig. 5. Equipment and experimental facility.

Fig. 6. Bottom port-type capacitive analog microphone unit used as the MEMS microphone.

Fig. 7. Frequency responses for chambers with holes of different diameters.

3.3 Voice Quality Improvement Experiment and Result

To verify the improvement in voice quality that was obtained using the compensation filter, we applied digital filters to the input/output path of the experimental signal. The frequency response filters obtained by simulating the chamber’s characteristics were applied to the input terminals. The compensation filters were designed with the inverse characteristic over the primary bandwidth and were applied to the output terminals.
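The input/output arrangement described here can be illustrated with a pair of second-order peaking filters: one that boosts around the resonance to stand in for the chamber, and one with the opposite gain to compensate. The sketch below uses the widely published peaking-EQ biquad formulas from the Audio EQ Cookbook rather than the authors' filters, and the sample rate, peak height, and Q are hypothetical values chosen only for illustration. Only the center frequency is taken from Table 1 (1-mm hole).

```python
import cmath
import math


def peaking_biquad(f0, gain_db, q, fs):
    """Peaking-EQ biquad (Audio EQ Cookbook form); returns (b, a) with a[0] = 1."""
    A = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0 + alpha * A, -2.0 * math.cos(w0), 1.0 - alpha * A]
    a = [1.0 + alpha / A, -2.0 * math.cos(w0), 1.0 - alpha / A]
    return [x / a[0] for x in b], [x / a[0] for x in a]


def magnitude_db(b, a, f, fs):
    """Magnitude response in dB of a biquad at frequency f."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * f / fs)  # z^-1 on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20.0 * math.log10(abs(num / den))


FS = 48000.0    # ASSUMED sample rate
F0 = 824.4      # calculated resonance for the 1-mm hole (Table 1)
PEAK_DB = 10.0  # HYPOTHETICAL resonance height
Q = 4.0         # HYPOTHETICAL quality factor

chamber = peaking_biquad(F0, +PEAK_DB, Q, FS)       # stands in for the chamber
compensation = peaking_biquad(F0, -PEAK_DB, Q, FS)  # inverse characteristic

for f in (100.0, 500.0, F0, 2000.0, 4000.0):
    residual = magnitude_db(*chamber, f, FS) + magnitude_db(*compensation, f, FS)
    print(f"{f:7.1f} Hz: chamber {magnitude_db(*chamber, f, FS):+6.2f} dB, "
          f"residual after compensation {residual:+.2e} dB")
```

With matching center frequency and Q, the two filters are exact inverses of each other, so the cascade is flat. In practice the measured chamber response is not an ideal peaking curve, so some residual deviation would remain.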

The digital filter is responsible for correcting the frequency-response changes that occur in the main band affecting the intelligibility of speech. Correcting these changes through signal processing has two advantages. First, unlike conventional analog circuit systems, there is no electrical signal cancellation due to a phase difference. Second, since no additional hardware is required, it also reduces cost.

Fig. 8 shows the digital filters that compensate for the frequency responses of the primary bandwidth. They were produced with consideration of the DSP’s application range and bandwidth. The solid, thick dotted, and thin dotted lines represent the compensation filters for holes that were 1, 1.5, and 2 mm in diameter, respectively. The improvement was measured using perceptual objective listening quality assessment (POLQA) with and without the compensation filters.

The speech transmission index (STI) evaluates voice quality using noise generated in uniform narrow bands. In contrast, POLQA is an objective voice quality testing method that models subjective perception and evaluates quality by reflecting perceptual characteristics based on a voice-sound source. We used the POLQA V2.4 algorithm based on the ITU-T P.863 (2014) standard. The results are shown in Fig. 9 using MOS-LQO, which denotes the mean opinion score for listening quality, objective. The score ranges from 1 to 5, with higher numbers representing higher quality ^[6].

The POLQA test results for each hole diameter are shown in Table 2. For each diameter, the score increased when the filter was applied. Given the sensitive reaction to the diameter of the holes, the performance of the compensation filter can be improved through testing and estimation of spatial characteristics.
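The size of the improvement for each diameter follows directly from the scores in Table 2. The short calculation below only restates those published values. The variable names are illustrative.

```python
# MOS-LQO scores from Table 2: diameter (mm) -> (before filter, after filter)
scores = {
    1.0: (3.95, 4.50),
    1.5: (4.31, 4.47),
    2.0: (4.22, 4.50),
}

# Improvement per diameter, rounded to the two decimals reported in the table.
improvement = {d: round(after - before, 2) for d, (before, after) in scores.items()}

for d, delta in improvement.items():
    print(f"{d} mm hole: MOS-LQO improved by {delta:.2f}")
```

The gain is largest for the 1-mm hole (0.55) and smallest for the 1.5-mm hole (0.16), which is the sensitivity to hole diameter noted above.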

This improvement is achieved by superposing digital filters in the DSP, without the need for additional sensors or components. Although it depends on the performance of the DSP, multiple digital filters can be applied in cascade, so the method can also cope with frequency-response changes caused by more complex spaces. Existing intelligibility optimization work has not accounted for the frequency-response changes caused by the structural space. As a result, much trial and error occurs because there is no correct reference to tune against, and these test results show a way to shorten that tuning time.

This study allows us to grasp the structural characteristics of the space where MEMS microphones are installed and provides a method of compensating for them through digital signal processing. It analyzes the structural space and uses digital processing to optimize the parts that cannot be physically modified. This can produce a large effect at low cost. Therefore, it has sufficient industrial value for manufacturers who produce or use MEMS microphone components. In addition, the method should be valuable as an application technology for researchers studying MEMS microphones, because removing the frequency-response changes caused by spatial effects allows them to study the characteristics of the sensor itself.

Fig. 8. Digital filters that compensate for the frequency responses in the important bandwidth.

Fig. 9. MOS-LQO scores with a filter (top) and without one (bottom) for a chamber with a 1-mm-diameter hole.

Table 2. POLQA test results for each hole diameter.

Diameter (mm)	MOS-LQO before filter application	MOS-LQO after filter application
1	3.95	4.50
1.5	4.31	4.47
2	4.22	4.50

4. Conclusion

We have described a method of improving voice quality by compensating, within the general signal processing range, for the change in frequency response caused by the spatial characteristics of a chamber where a MEMS microphone is installed. The results are technically significant in two ways. First, the chamber’s rough structure was easily modeled by treating the single MEMS microphone unit as a whole, simplifying the space, and treating it as a Helmholtz resonator. Second, the frequency bandwidth that exhibits a sensitive reaction was identified, and instant compensation was achieved using a digital filter. The extent of improvement was verified by calculating the responses based on measured characteristics of the chamber and the microphone input in a real anechoic environment rather than relying on assumptions about the characteristics of the simulation environment.

MEMS microphones are used in many devices such as smartphones, laptops, and tablets; home appliances such as TVs, refrigerators, and washing machines; and even everyday electronics such as digital door locks and alarms. They are widely used in electronic terminals for IoT services. Accordingly, it is necessary for manufacturers to optimize the performance of MEMS microphones in a variety of spatial environments ^[7]. However, most electronics manufacturers do not have a sound engineer or perform functional design independently. Further, it is even common for companies to conduct a design process without investigating spatial effects, and they conduct limited adjustment during the final tuning stage. Therefore, the results of this study may be used as a reference in the production of electronics so the desired sound sensor performance can be achieved from the development stage. This will be especially helpful for fields that require high-performance sound input, such as AI voice recognition systems. In future studies, we plan to develop a program that can predict the shape and characteristics of various chambers automatically and to test the quality of AI voice recognition as a functional extension.

ACKNOWLEDGMENTS

This work was supported by the global professional technology development project funded by the Ministry of Trade, Industry & Energy (20003934, 2019).

REFERENCES

[1] Kwak Jun-Hyuk, Kang Hanmi, Lee YoungHwa, Jung Youngdo, Kim Jin-Hwan, Hur Shin, September 2014, Performance Test and Evaluations of a MEMS Microphone for the Hearing Impaired, Journal of Sensor Science and Technology, Vol. 53, No. 5, pp. 326-331

[2] Kwon Hyu-sang, Lee Kwang-Cheol, January 2007, Design and fabrication of condenser microphone with rigid backplate and vertical acoustic holes using DRIE and wafer bonding technology, J. of the Korean Sensors Society, Vol. 16, No. 1, pp. 62-67

[3] Ryu Hokyung, Chung Seong Jin, Lee Jin Woo, April 2014, Design of a Helmholtz Resonator for Noise Reduction in a Duct Considering Geometry Information: Additional Relationship Equation and Experiment, Korean Soc. Mech. Eng. A, Vol. 38, No. 4, pp. 459-468

[4] Kim Keon-Wook, March 2012, Design and Analysis of Experimental Anechoic Chamber for Localization, The Journal of the Acoustical Society of Korea, Vol. 31, No. 4, pp. 225-234

[5] International Electrotechnical Commission, TC 100/TA 20 - Analogue and digital audio, 2014, IEC 60268-4 Sound system equipment - Part 4: Microphones

[6] Beerends John G., Neumann Niels M. P., Broek Egon L. van den, Casanovas Anna Llagostera, Menendez Jovana Torres, Schmidmer Christian, Berger Jens, December 2019, Subjective and Objective Assessment of Full Bandwidth Speech Quality, IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 28, pp. 440-449

[7] Kim Haeyong, Park Huiung, Kim Seon-Tae, February 2020, Light-weight Device Platform Supporting Large-scale Device Network, Journal of the Institute of Electronics and Information Engineers, Vol. 52, No. 2, pp. 37-42

Author

Hyun-Kab Kim

Hyun-Kab Kim is a graduate student in the Department of Electrical and Computer Engineering at the University of Seoul. In 2018, he received a master's degree in the Department of Electrical and Computer Engineering at the University of Seoul. He is a researcher at the Korea Electronics Technology Institute (KETI) Information Media Research Center. Until 2013, he was a technical advisor to JayWorks Korea. His current research interests include system control, sound systems, sensor networks, and signal processing.

Gyu-Sik Kim

Gyu-Sik Kim is a professor of Electrical and Computer Engineering at the University of Seoul. He graduated from Seoul National University with a degree in electronic engineering in 1981. He earned his master's degree in control measurement engineering at Seoul National University in 1983. In 1990, he earned his Ph.D. from the Department of Control and Measurement Engineering at Seoul National University. He served as a senior researcher at Daewoo Heavy Industries' Central Research Institute until 1992. He served as a visiting professor at the University of Wisconsin-Madison from 2003 to 2005. His current research interests include sensor networks, nonlinear control, and energy conversion.

Jongseol Lee

Jongseol Lee is a senior researcher at the Korea Electronics Technology Institute's Information & Media Research Center. He graduated from Chungbuk National University in 1996 with a major in information and communication engineering. He earned his master's degree in information and communication engineering at Chungbuk National University in 2001. He earned his doctorate in computer science at Konkuk University in 2018. He worked as a full-time researcher at Samsung Electronics in 2001. His current research interests include information and communication, computer technology, IoT applications, and media technology.