[1] Furui, S. (1996). An Overview of Speaker Recognition Technology. In C-H. Lee, F. K. Soong, & K. K. Paliwal (Eds.),
Automatic Speech and Speaker Recognition: Advanced Topics (pp. 31-56). Springer.
https://doi.org/10.1007/978-1-4613-1367-0_2
[2] Dehak, N., Kenny, P. J., Dehak, R., Dumouchel, P., & Ouellet, P. (2011). Front-End Factor Analysis for Speaker Verification.
Institute of Electrical and Electronics Engineers Transactions on Audio, Speech, and Language Processing,
19(4), 788-798.
https://doi.org/10.1109/TASL.2010.2064307
[3] Okabe, K., Koshinaka, T., & Shinoda, K. (2018, September 2-6).
Attentive statistics pooling for deep speaker embedding [Conference session].
2018 International Speech Communication Association, Hyderabad, India.
https://doi.org/10.21437/interspeech.2018-993
[4] Mohammad Amini, M., & Matrouf, D. (2021, January 18-21).
Data augmentation versus noise compensation for x-vector speaker recognition systems in noisy environments [Conference session].
28th European Signal Processing Conference, Amsterdam, Netherlands.
https://doi.org/10.23919/Eusipco47968.2020.9287690
[5] VoxCeleb. (n.d.).
VoxCeleb: Large-scale audio-visual datasets of human speech. https://mm.kaist.ac.kr/datasets/voxceleb/#downloads
[6] Openslr. (n.d.).
LibriSpeech ASR corpus. https://www.openslr.org/12
[8] Hom, K. L., Beigi, H., & Betti, R. (2022). Application of Speaker Recognition x-Vectors to Structural Health Monitoring. In Z. Mao (Ed.),
Model Validation and Uncertainty Quantification, Volume 3 (pp. 139-148). Springer International Publishing.
https://doi.org/10.1007/978-3-030-77348-9_18
[9] Reynolds, D. A., & Rose, R. C. (1995). Robust text-independent speaker identification using Gaussian mixture speaker models.
Institute of Electrical and Electronics Engineers Transactions on Speech and Audio Processing,
3(1), 72-83.
https://doi.org/10.1109/89.365379
[10] Reynolds, D. A., Quatieri, T. F., & Dunn, R. B. (2000). Speaker Verification Using Adapted Gaussian Mixture Models.
Digital Signal Processing,
10(1-3), 19-41.
https://doi.org/10.1006/dspr.1999.0361
[11] Kenny, P., Boulianne, G., Ouellet, P., & Dumouchel, P. (2007). Joint Factor Analysis Versus Eigenchannels in Speaker Recognition.
Institute of Electrical and Electronics Engineers Transactions on Audio, Speech, and Language Processing,
15(4), 1435-1447.
https://doi.org/10.1109/TASL.2006.881693
[12] Snyder, D., Garcia-Romero, D., Sell, G., Povey, D., & Khudanpur, S. (2018, April 15-20).
X-Vectors: Robust DNN Embeddings for Speaker Recognition [Conference session].
2018 Institute of Electrical and Electronics Engineers International Conference on Acoustics, Speech and Signal Processing, Calgary, Alberta, Canada.
https://doi.org/10.1109/ICASSP.2018.8461375
[13] Kanagasundaram, A., Sridharan, S., Ganapathy, S., Singh, P., & Fookes, C. (2019, September 15-19).
A study of x-vector based speaker recognition on short utterances [Conference session].
Proceedings of the 20th Annual Conference of the International Speech Communication Association, Graz, Austria.
https://doi.org/10.21437/Interspeech.2019-1891
[14] Jahangir, R., Teh, Y. W., Memon, N. A., Mujtaba, G., Zareei, M., Ishtiaq, U., Akhtar, M. Z., & Ali, I. (2020). Text-Independent Speaker Identification Through Feature Fusion and Deep Neural Network.
Institute of Electrical and Electronics Engineers Access,
8, 32187-32202.
https://doi.org/10.1109/ACCESS.2020.2973541
[15] Tripathi, M., Singh, D., & Susan, S. (2020). Speaker Recognition Using SincNet and X-Vector Fusion. In L. Rutkowski, R. Scherer, M. Korytkowski, W. Pedrycz, R. Tadeusiewicz, & J. M. Zurada (Eds.),
Artificial Intelligence and Soft Computing (pp. 252-260). Springer International Publishing.
https://doi.org/10.1007/978-3-030-61401-0_24
[16] Rouvier, M., Dufour, R., & Bousquet, P. M. (2021, January 18-21).
Review of different robust x-vector extractors for speaker verification [Conference session].
28th European Signal Processing Conference, Amsterdam, Netherlands.
https://doi.org/10.23919/Eusipco47968.2020.9287426
[17] Wu, Z., Wang, S., Qian, Y., & Yu, K. (2019, September 15-19).
Data Augmentation Using Variational Autoencoder for Embedding Based Speaker Verification [Conference session].
Proceedings of the 20th Annual Conference of the International Speech Communication Association, Graz, Austria.
http://dx.doi.org/10.21437/Interspeech.2019-2248
[18] Taherian, H., Wang, Z. Q., Chang, J., & Wang, D. (2020). Robust Speaker Recognition Based on Single-Channel and Multi-Channel Speech Enhancement.
Institute of Electrical and Electronics Engineers/Association for Computing Machinery Transactions on Audio, Speech, and Language Processing,
28, 1293-1302.
https://doi.org/10.1109/TASLP.2020.2986896
[19] Kataria, S., Nidadavolu, P. S., Villalba, J., Chen, N., García-Perera, P., & Dehak, N. (2020, May 4-8).
Feature Enhancement with Deep Feature Losses for Speaker Verification [Conference session].
2020 Institute of Electrical and Electronics Engineers International Conference on Acoustics, Speech and Signal Processing, Barcelona, Spain.
https://doi.org/10.1109/ICASSP40776.2020.9053110
[20] Zeinali, H., Sameti, H., & Stafylakis, T. (2018, June 26-29).
DeepMine Speech Processing Database: Text-Dependent and Independent Speaker Verification and Speech Recognition in Persian and English [Conference session].
The Speaker and Language Recognition Workshop, Les Sables d'Olonne, France.
http://dx.doi.org/10.21437/Odyssey.2018-54
[21] Khoa, T. D., & Tsai, T. H. (2020, October 30-31).
A Text-Independent Speaker Verification for SdSV Challenge 2020 [Conference session].
2020 Institute of Electrical and Electronics Engineers 5th International Conference on Computing Communication and Automation, Greater Noida, India.
https://doi.org/10.1109/ICCCA49541.2020.9250773
[22] Khosravani, A., & Homayounpour, M. M. (2017). A PLDA approach for language and text independent speaker recognition.
Computer Speech & Language,
45, 457-474.
https://doi.org/10.1016/j.csl.2017.04.003
[24] Ravanelli, M., Parcollet, T., Plantinga, P., Rouhe, A., Cornell, S., Lugosch, L., Subakan, C., Dawalatabad, N., Heba, A., Zhong, J., Chou, J-C., Yeh, S-L., Fu, S-W., Liao, C-F., Rastorgueva, E., Grondin, F., Aris, W., Na, H., Gao, Y., De Mori, R., & Bengio, Y. (2021). SpeechBrain: A general-purpose speech toolkit.
arXiv, 1-34.
https://doi.org/10.48550/arXiv.2106.04624
[25] Sagayam, K. M., Bruntha, P. M., Sridevi, M., Renith Sam, M., Kose, U., & Deperlioglu, O. (2021). A cognitive perception on content-based image retrieval using an advanced soft computing paradigm. In T. Gandhi, S. Bhattacharyya, S. De, D. Konar, & S. Dey (Eds.),
Advanced Machine Vision Paradigms for Medical Image Analysis (pp. 189-211). Academic Press.
https://doi.org/10.1016/B978-0-12-819295-5.00007-X
[26] Butterworth, S. (1930). On the theory of filter amplifiers.
Wireless Engineer,
7(6), 536-541.
https://www.changpuak.ch/electronics/downloads/On_the_Theory_of_Filter_Amplifiers.pdf