Homepage of Ville Vestman
Photo of Ville Vestman
PhD in Computer Science, MSc in Mathematics
Former early-stage researcher
Computational Speech Group
School of Computing
University of Eastern Finland
Research topics: Speech technology, automatic speaker recognition systems
Email: ville.vestman [at] gmail.com

Events

Tuesday 10.11.2020 at 10:15 (UTC+2): My Ph.D. defense (Press release & link to the event)

Software

  1. ASVtorch speaker verification toolkit (PyTorch)
  2. GPU-accelerated implementation of an i-vector extractor (training and extraction) using PyTorch

Refereed journal articles

  1. K. A. Lee, V. Vestman, T. Kinnunen, "ASVtorch Toolkit: Speaker Verification with Deep Neural Networks", SoftwareX, Volume 14, 100697, June 2021, [Link]
  2. A. Nautsch, X. Wang, N. Evans, T. Kinnunen, V. Vestman, M. Todisco, H. Delgado, M. Sahidullah, J. Yamagishi, K.A. Lee, "ASVspoof 2019: Spoofing countermeasures for the detection of synthesized, converted and replayed speech", IEEE Transactions on Biometrics, Behavior, and Identity Science, Volume 3, no. 2, pp. 252–265, April 2021 [arXiv]
  3. T. Kinnunen, H. Delgado, N. Evans, K.A. Lee, V. Vestman, A. Nautsch, M. Todisco, X. Wang, M. Sahidullah, J. Yamagishi, D.A. Reynolds, "Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification: Fundamentals", to appear in IEEE/ACM Transactions on Audio, Speech, and Language Processing [arXiv]
  4. X. Wang, J. Yamagishi, M. Todisco, H. Delgado, A. Nautsch, N. Evans, M. Sahidullah, V. Vestman, T. Kinnunen, K.A. Lee, L. Juvela, P. Alku, Y.-H. Peng, H.-T. Hwang, Y. Tsao, H.-M. Wang, S. L. Maguer, M. Becker, F. Henderson, R. Clark, Y. Zhang, Q. Wang, Y. Jia, K. Onuma, K. Mushika, T. Kaneda, Y. Jiang, L.-J. Liu, Y.-C. Wu, W.-C. Huang, T. Toda, K. Tanaka, H. Kameoka, I. Steiner, D. Matrouf, J.-F. Bonastre, A. Govender, S. Ronanki, J.-X. Zhang, Z.-H. Ling, "ASVspoof 2019: A large-scale public database of synthetic, converted and replayed speech", Computer Speech & Language, Volume 64, 101114, November 2020 [arXiv]
  5. A. Sholokhov, T. Kinnunen, V. Vestman, K. A. Lee, "Voice Biometrics Security: Extrapolating False Alarm Rate via Hierarchical Bayesian Modeling of Speaker Verification Scores", Computer Speech & Language, Volume 60, 101024, March 2020 [arXiv]
  6. V. Vestman, T. Kinnunen, R. González Hautamäki, M. Sahidullah, "Voice Mimicry Attacks Assisted by Automatic Speaker Verification", Computer Speech & Language, Volume 59, January 2020, pp. 36–54 [pdf]
  7. V. Vestman, D. Gowda, M. Sahidullah, P. Alku, T. Kinnunen, "Speaker Recognition from Whispered Speech: A Tutorial Survey and an Application of Time-Varying Linear Prediction", Speech Communication, Volume 99, May 2018, pp. 62–79 [pdf]

Refereed conference papers

  1. M. Sahidullah, A. K. Sarkar, V. Vestman, X. Liu, R. Serizel, T. Kinnunen, Z.-H. Tan, E. Vincent, "UIAI System for Short-Duration Speaker Verification Challenge 2020", Accepted to SLT 2021 [arXiv]
  2. A. Sholokhov, T. Kinnunen, V. Vestman, K. A. Lee, "Extrapolating False Alarm Rates in Automatic Speaker Verification", Proc. Interspeech 2020, Shanghai, China, October 2020, pp. 4218–4222 [arXiv]
  3. V. Vestman, K. A. Lee, T. H. Kinnunen, "Neural i-vectors", Proc. Odyssey: The Speaker and Language Recognition Workshop, Tokyo, Japan, November 2020, pp. 67–74 [arXiv]
  4. V. Vestman, K. A. Lee, T. H. Kinnunen, T. Koshinaka, "Unleashing the Unused Potential of I-Vectors Enabled by GPU Acceleration", Proc. Interspeech 2019, Graz, Austria, September 2019, pp. 351–355 [arXiv]
  5. M. Todisco, X. Wang, V. Vestman, M. Sahidullah, H. Delgado, A. Nautsch, J. Yamagishi, N. Evans, T. Kinnunen, K. A. Lee, "ASVspoof 2019: Future Horizons in Spoofed and Fake Audio Detection", Proc. Interspeech 2019, Graz, Austria, September 2019, pp. 1008–1012 [arXiv]
  6. K. A. Lee, V. Hautamäki, T. Kinnunen, H. Yamamoto, K. Okabe, V. Vestman, J. Huang, G. Ding, H. Sun, A. Larcher, R. K. Das, H. Li, M. Rouvier, P. Bousquet, W. Rao, Q. Wang, C. Zhang, F. Bahmaninezhad, H. Delgado, M. Todisco, Q. Wang, L. Guo, T. Koshinaka, J. Zhang, K. Shinoda, T. N. Trong, M. Sahidullah, F. Lu, Y. Tang, M. Tu, K. K. Teh, H. D. Tran, K. K. George, I. Kukanov, F. Desnous, J. Yang, E. Yılmaz, L. Xu, J. Bonastre, C. Xu, Z. H. Lim, E. S. Chng, S. Ranjan, J. H. L. Hansen, J. Patino, N. Evans, "I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences", Proc. Interspeech 2019, Graz, Austria, September 2019, pp. 1497–1501 [arXiv]
  7. V. Vestman, B. Soomro, A. Kanervisto, V. Hautamäki, T. Kinnunen, "Who Do I Sound Like? Showcasing Speaker Recognition Technology by YouTube Voice Search", Proc. ICASSP 2019, Brighton, UK, May 2019, pp. 5781–5785 [pdf]
  8. T. Kinnunen, R. González Hautamäki, V. Vestman, M. Sahidullah, "Can We Use Speaker Recognition Technology to Attack Itself? Enhancing Mimicry Attacks Using Automatic Target Speaker Selection", Proc. ICASSP 2019, Brighton, UK, May 2019, pp. 6146–6150 [pdf]
  9. V. Vestman, T. Kinnunen, "Supervector Compression Strategies to Speed up I-vector System Development", Proc. Odyssey: The Speaker and Language Recognition Workshop, Les Sables d’Olonne, France, June 2018, pp. 357–364 [pdf]
  10. V. Vestman, D. Gowda, M. Sahidullah, P. Alku, T. Kinnunen, "Time-Varying Autoregressions for Speaker Verification in Reverberant Conditions", Proc. Interspeech 2017, Stockholm, Sweden, August 2017, pp. 1512–1516 [pdf]
  11. K. A. Lee, V. Hautamäki, T. Kinnunen, A. Larcher, C. Zhang, A. Nautsch, T. Stafylakis, G. Liu, M. Rouvier, W. Rao, F. Alegre, J. Ma, M. W. Mak, A. K. Sarkar, H. Delgado, R. Saeidi, H. Aronowitz, A. Sizov, H. Sun, T. H. Nguyen, G. Wang, B. Ma, V. Vestman, M. Sahidullah, M. Halonen, A. Kanervisto, G. Le Lan, F. Bahmaninezhad, S. Isadskiy, C. Rathgeb, C. Busch, G. Tzimiropoulos, Q. Qian, Z. Wang, Q. Zhao, T. Wang, H. Li, J. Xue, S. Zhu, R. Jin, T. Zhao, P.-M. Bousquet, M. Ajili, W. B. Kheder, D. Matrouf, Z. H. Lim, C. Xu, H. Xu, X. Xiao, E. S. Chng, B. Fauve, K. Sriskandaraja, V. Sethu, W. W. Lin, D. A. L. Thomsen, Z.-H. Tan, M. Todisco, N. Evans, H. Li, J. H. L. Hansen, J.-F. Bonastre, E. Ambikairajah, "The I4U Mega Fusion and Collaboration for NIST Speaker Recognition Evaluation 2016", Proc. Interspeech 2017, Stockholm, Sweden, August 2017, pp. 1328–1332 [pdf]
  12. A. Kanervisto, V. Vestman, M. Sahidullah, V. Hautamäki, T. Kinnunen, "Effects of Gender Information in Text-Independent and Text-Dependent Speaker Verification", Proc. ICASSP 2017, New Orleans, US, March 2017, pp. 5360–5364 [pdf]

Theses

  1. V. Vestman, "Methods for fast, robust, and secure speaker recognition", Ph.D. thesis, Computer Science, 2020 [pdf]
  2. V. Vestman, "Modeling temporal characteristics of line spectral frequencies with an application to automatic speaker verification", Master’s thesis, Computer Science, 2016 [pdf]
  3. V. Vestman, "Fourier-menetelmät osittaisdifferentiaaliyhtälöissä", Master’s thesis, Mathematics, 2013 (in Finnish) [pdf]