Text Independent Biometric Speaker Recognition System

Luqman Gbadamosi
Published Date: November 05, 2013
Volume 3, Issue 6
Pages 9 - 15

Keywords: MFCC, voice print, VQLBG, voice recognition
Luqman Gbadamosi, "Text Independent Biometric Speaker Recognition System". International Journal of Research in Computer Science, 3 (6): pp. 9-15, November 2013. doi:10.7815/ijorcs.36.2013.073


Designing a machine that mimics human behavior, particularly one capable of responding properly to spoken language, has intrigued engineers and scientists for centuries. Earlier research on voice recognition focused on text-dependent systems, which require the user to say exactly the same text or passphrase for both enrollment and verification before gaining access. In that approach, test speech polluted by additive noise at different decibel levels achieves only a 75% recognition rate, and the method requires the full cooperation of the speaker, so it cannot be used for forensic investigation. This paper presents the historical background and technological advances in voice recognition and, most importantly, the study and implementation of a text-independent biometric voice recognition system that can be used for speaker identification with a 100% recognition rate. The technique makes it possible to use a speaker's voice to verify their identity and to control access to services such as voice dialing, telephone shopping, database access services, information services, voice mail, and remote access to computers. The implementation uses Mel Frequency Cepstral Coefficients (MFCCs) for feature extraction and vector quantization with the Linde-Buzo-Gray algorithm (VQLBG) to minimize the amount of data to be handled. The matching result is based on minimum distortion distance. The project is coded in MATLAB.
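The vector quantization and minimum-distortion matching steps named in the abstract can be illustrated with a small sketch. This is not the paper's MATLAB implementation: it is a pure-Python approximation under stated assumptions. The function names (`lbg_codebook`, `avg_distortion`, `identify`, `synthetic_features`) are hypothetical, random 2-D vectors stand in for real MFCC frames, and the codeword split uses a small additive perturbation rather than whatever splitting rule the original project used.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vec, codebook):
    """Index of the codeword closest to vec."""
    return min(range(len(codebook)), key=lambda i: sq_dist(vec, codebook[i]))

def mean_vector(vectors):
    """Component-wise mean (centroid) of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def lbg_codebook(vectors, size, eps=0.01, iters=20):
    """Train a codebook by LBG-style splitting (size assumed to be a power of two)."""
    codebook = [mean_vector(vectors)]
    while len(codebook) < size:
        # Split every codeword into two copies shifted by +eps and -eps (assumed rule).
        codebook = [[c + s for c in cw] for cw in codebook for s in (eps, -eps)]
        for _ in range(iters):
            # Assign each training vector to its nearest codeword.
            clusters = [[] for _ in codebook]
            for v in vectors:
                clusters[nearest(v, codebook)].append(v)
            # Move each codeword to its cluster centroid; keep it if the cluster is empty.
            codebook = [mean_vector(cl) if cl else cw
                        for cl, cw in zip(clusters, codebook)]
    return codebook

def avg_distortion(vectors, codebook):
    """Average squared distance from each vector to its nearest codeword."""
    total = sum(sq_dist(v, codebook[nearest(v, codebook)]) for v in vectors)
    return total / len(vectors)

def identify(test_vectors, codebooks):
    """Return the enrolled speaker whose codebook gives the minimum distortion."""
    return min(codebooks, key=lambda name: avg_distortion(test_vectors, codebooks[name]))

def synthetic_features(center, n, rng):
    """Hypothetical stand-in for MFCC frames: Gaussian points around a center."""
    return [[rng.gauss(c, 1.0) for c in center] for _ in range(n)]

if __name__ == "__main__":
    rng = random.Random(0)
    # Enrollment: one codebook per speaker, trained on that speaker's frames.
    enrolled = {
        "speaker_a": lbg_codebook(synthetic_features((0.0, 0.0), 200, rng), 4),
        "speaker_b": lbg_codebook(synthetic_features((5.0, 5.0), 200, rng), 4),
    }
    # Identification: frames from an unknown utterance are scored against every codebook.
    unknown = synthetic_features((5.0, 5.0), 50, rng)
    print(identify(unknown, enrolled))
```

Because each speaker is reduced to a handful of codewords, matching only compares test frames against those codewords instead of every enrollment frame, which is the data reduction the abstract attributes to VQLBG. In a real system the synthetic vectors would be replaced by MFCC frames extracted from recorded speech.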

