Machine learning-based self-powered acoustic sensor for speaker recognition

Keon Jae Lee

International Congress on AI and Machine Learning
August 02, 2021 | Webinar

Keon Jae Lee

Professor, KAIST, Korea

ScientificTracks Abstracts: ijircce

Abstract

Voice recognition is the most intuitive user interface for bilateral communication between humans and smart devices. Speaker recognition has drawn attention as the next major advance in voice user interfaces (VUI), with applications such as personalized voice-controlled assistants, smart home appliances, and biometric authentication based on artificial intelligence (AI). Conventional speaker recognition has relied on condenser-type microphones, which detect sound by measuring the capacitance between two conducting layers while drawing continuous power. The condenser-type microphone, however, has critical drawbacks such as low sensitivity, high power consumption, and an unstable circuit due to large gain amplification. Speaker recognition also suffers from a low recognition rate, caused by limited voice information and by the lack of optimal algorithms for a simple and accurate process. Herein, we report a machine learning-based multi-channel resonant acoustic sensor that mimics the basilar membrane of the human cochlea. The basilar membrane (BM)-inspired sensor was fabricated as a highly sensitive, self-powered flexible piezoelectric acoustic sensor (f-PAS) with a multi-resonant frequency band. The speech waveforms of the standard TIDIGITS dataset were recorded by the multi-channel f-PAS and converted into frequency-domain signals using a Fast Fourier Transform (FFT) and a Short-Time Fourier Transform (STFT) to obtain the characteristics of the frequency components. A Gaussian Mixture Model (GMM) and a Convolutional Neural Network (CNN) were used for speaker recognition, yielding a t-distributed Stochastic Neighbor Embedding (t-SNE) plot of the STFT features for the training dataset and the testing utterances. Finally, the f-PAS achieved a 97.5% speaker recognition rate, with a 75% reduction in error rate compared with the reference MEMS microphone.
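
The recognition pipeline described above (time-domain recording, STFT features, a per-speaker statistical model, and a likelihood comparison) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes synthetic harmonic tones in place of f-PAS recordings of TIDIGITS, and it swaps the GMM for a single diagonal Gaussian per speaker (a one-component stand-in) so that it runs with NumPy alone. All function names, the speaker labels, and the parameters (`frame_len`, `hop`, `sr`, the pitch values) are hypothetical.

```python
import numpy as np

def stft_log_magnitude(signal, frame_len=256, hop=128):
    """Return log-magnitude STFT frames, shape (n_frames, frame_len // 2 + 1)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop: i * hop + frame_len] * window
                       for i in range(n_frames)])
    spectrum = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(spectrum + 1e-8)  # small offset avoids log(0)

def fit_diag_gaussian(features):
    """Fit one diagonal Gaussian: a simplified, one-component stand-in for a GMM."""
    mean = features.mean(axis=0)
    var = features.var(axis=0) + 1e-6  # variance floor for numerical stability
    return mean, var

def avg_log_likelihood(features, model):
    """Average per-frame log-likelihood of the features under a diagonal Gaussian."""
    mean, var = model
    ll = -0.5 * (np.log(2 * np.pi * var) + (features - mean) ** 2 / var)
    return ll.sum(axis=1).mean()

def identify(features, models):
    """Return the enrolled speaker whose model scores the utterance highest."""
    return max(models, key=lambda name: avg_log_likelihood(features, models[name]))

def synth_voice(pitch_hz, seed, sr=8000, dur=1.0):
    """Hypothetical toy 'speaker': three harmonics of a pitch plus white noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(sr * dur)) / sr
    tone = sum(np.sin(2 * np.pi * pitch_hz * k * t) / k for k in (1, 2, 3))
    return tone + 0.1 * rng.standard_normal(t.size)

# Enrollment: one model per (synthetic) speaker.
models = {
    "speaker_a": fit_diag_gaussian(stft_log_magnitude(synth_voice(120.0, seed=0))),
    "speaker_b": fit_diag_gaussian(stft_log_magnitude(synth_voice(210.0, seed=1))),
}

# Test utterance: same pitch as speaker_b, different noise realization.
test_utterance = stft_log_magnitude(synth_voice(210.0, seed=2))
predicted = identify(test_utterance, models)
print(predicted)
```

In this toy setting the test utterance shares its pitch with `speaker_b`, so that model should score highest. A closer analogue of the approach in the abstract would replace the single Gaussian with a multi-component GMM fitted by expectation-maximization, and would use recorded multi-channel speech instead of synthetic tones.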

Biography

Keon Jae Lee received his Ph.D. in materials science and engineering (MSE) from the University of Illinois, Urbana-Champaign (UIUC). During his Ph.D. at UIUC, he was involved in the first co-invention of “flexible single-crystalline inorganic electronics”, using top-down semiconductors and soft lithographic transfer. Since 2009, he has been a professor in MSE at KAIST. His current research topics are self-powered flexible electronic systems, including self-powered sensors and energy harvesters, micro LEDs, neuromorphic memory and large-scale integration (LSI), and laser-material interaction for in vivo biomedical applications.