Journal of Artificial Intelligence and Data Mining, Volume 12, Issue 3, Pages 337-347
Title
Deep Learning Approach for Robust Voice Activity Detection: Integrating CNN and Self-Attention with Multi-Resolution MFCC
Abstract
Voice Activity Detection (VAD) plays a vital role in audio processing applications such as speech recognition, speech enhancement, telecommunications, satellite telephony, and noise reduction, and an accurate VAD method can improve the performance of these systems. In this paper, multi-resolution Mel-Frequency Cepstral Coefficients (MRMFCCs), together with their first- and second-order derivatives (delta and delta-delta), are extracted from the speech signal and fed into a deep model. The proposed model begins with convolutional layers, which are effective at capturing local features and patterns in the data. The captured features are then passed to two consecutive multi-head self-attention layers, which let the model selectively focus on the most relevant features across the entire input sequence and thereby reduce the influence of irrelevant noise. Combining convolutional layers with self-attention enables the model to capture both local and global context within the speech signal. The model concludes with a dense layer for classification. The proposed model is evaluated under noisy conditions using 15 different noise types from the NOISEX-92 corpus. The experimental results show that the proposed framework outperforms traditional VAD techniques, even in noisy environments.
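The first- and second-order derivatives mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common regression-based delta formula with edge-frame replication; the abstract does not state which delta scheme or window width the paper uses, so the function names `delta` and `stack_with_deltas` and the `width` parameter are hypothetical. The static coefficients are taken as given (MFCC extraction itself is not shown).

```python
from typing import List

Frames = List[List[float]]  # one list of coefficients per time frame


def delta(frames: Frames, width: int = 2) -> Frames:
    """Regression-based time derivative of a feature sequence.

    d_t = sum_{n=1..width} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n} n^2)

    Edge frames are repeated so the output keeps the same number of frames.
    The half-window `width` is an assumption, not a value from the paper.
    """
    if width < 1:
        raise ValueError("width must be >= 1")
    n_frames = len(frames)
    if n_frames == 0:
        return []
    denom = 2.0 * sum(n * n for n in range(1, width + 1))

    def frame_at(t: int) -> List[float]:
        # Clamp the index to replicate the first/last frame at the edges.
        return frames[min(max(t, 0), n_frames - 1)]

    out: Frames = []
    for t in range(n_frames):
        row = []
        for k in range(len(frames[t])):
            acc = sum(
                n * (frame_at(t + n)[k] - frame_at(t - n)[k])
                for n in range(1, width + 1)
            )
            row.append(acc / denom)
        out.append(row)
    return out


def stack_with_deltas(frames: Frames, width: int = 2) -> Frames:
    """Concatenate static, delta, and delta-delta coefficients per frame."""
    d1 = delta(frames, width)
    d2 = delta(d1, width)  # second-order derivative = delta of the delta
    return [s + a + b for s, a, b in zip(frames, d1, d2)]
```

Under these assumptions, each frame handed to the model would carry three times as many values as the static feature vector: the coefficients themselves, their delta, and their delta-delta.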
Keywords
Voice Activity Detection, self-attention mechanism, multi-resolution Mel-Frequency Cepstral Coefficients, deep learning
Authors
Khadijeh Aghajani
Department of Computer Engineering, Faculty of Engineering and Technology, University of Mazandaran, Babolsar, Iran.
URL
https://jad.shahroodut.ac.ir/article_3335_8731f540e6516b844b0e3b64b8931881.pdf
Language
en