Digital Speech Processing

Digital Speech Processing (4-6 op, i.e. credits) 175317

Digitaalinen puheenkäsittely
The course is an optional laudatur-level course (special/advanced studies, erikoiskurssi).

Course description

Speech is the oldest and most efficient means of communication between human beings, but it is not as simple for computers as one may think! This course gives an introduction to techniques for processing and analyzing digital speech and audio signals. The topics include spectral analysis methods, fundamental frequency and formant tracking, denoising, and feature extraction for recognition applications. A short introduction to pattern matching techniques such as dynamic time warping and Gaussian mixtures will also be given. These are the basic components of most speech technology applications, including speech and speaker recognition, speech coding, and synthesis.
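To give a flavour of one of the pattern matching techniques mentioned above, here is a minimal sketch of dynamic time warping on two sequences of numbers. It is written in Python purely for illustration (the course exercises use Matlab). The function name `dtw_distance`, the absolute-difference local cost, and the three allowed step directions are assumptions made for this example, not the formulation used in the lectures.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    Assumptions for this sketch: the local cost is |a[i] - b[j]| and the
    warping path may step down, right, or diagonally, with no slope
    constraints or path-length normalization.
    """
    if not a or not b:
        raise ValueError("both sequences must be non-empty")

    n, m = len(a), len(b)
    inf = float("inf")

    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


if __name__ == "__main__":
    # The second sequence is a time-stretched version of the first.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

In a speech application the sequence elements would typically be feature vectors per frame rather than single numbers, with a vector distance as the local cost.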

News
Final results. Please remember to give course feedback as well. Thanks & Happy Easter to everyone! :)

Final seminar: Tuesday 8.4.2009, at 11:15 - 13:00, T/B 180

Instructions for the project (added 31.3.2009)

The first exam: Thursday 19.3.2009 at 12-14 (sharp!) in T/D 106.
Sample problems for exam practice, with solutions

Teachers & material

Lectures: Tomi Kinnunen (E-mail: tkinnu@cs.joensuu.fi, Office: T/B 358)
Exercises: Rahim Saeidi (E-mail: rahim@cs.joensuu.fi, Office: T/B 356)

Schedule

Note! The course starts already in week 9; the first lecture is on Wednesday 25.2. The course finishes in week 15 (6.-12.4.2009), before Easter. There are two lectures and two exercise groups each week, amounting to ~26 hours of lectures and ~24 hours of exercises.

The schedule in week 9 (23.2. - 1.3.) is exceptional:

Lecture: Wednesday 25.2 at 14-16, T/D 106
Lecture: Thursday 26.2 at 10-12, T/D 106
Exercise: Thursday 26.2 at 16-18, T/B 247 (computer class)
Exercise: Friday 27.2 at 12-14, T/B 178 (computer class)

"Normal schedule" after the first week:

Lecture: Monday 12-14, 2D106B (Time and room changed!)
Lecture: Tuesday 10-12, T/D 106, except 17.3 at 12-14, 2D106B
Exercise: Monday 14-16, T/B 247 (computer class)
Exercise: Tuesday 14-16, T/B 247, except 10.3 in T/B 178 (computer class)

Last lecture (recap of the course): Monday 6.4 at 10-12, T/D 106
Exam: Tuesday 7.4 at 16-19, T/D 106.

Preliminary knowledge

Sufficient knowledge of mathematics and programming is required. Knowledge of digital signal processing and Matlab will be helpful but is not necessary. The course is recommended for students in their 3rd year or later, or for anyone who is motivated to learn!

Course implementation

The course will consist of lectures ("theory"), exercises ("practice") and a considerable number of exercises in Matlab ("the real practice"). It is possible to do an additional research-oriented project worth 1-2 op after the course.
Exercises will consist of both traditional paper-and-whiteboard exercises and computer exercises using Matlab. For the "traditional" exercises the 1/3 requirement applies; extra points are worth up to 10% of the maximum exam points.
Some of the Matlab exercises will be compulsory and must be submitted before participating in the exam. High weight will be given to these practical exercises.
Grading (roughly): 40% for the compulsory Matlab exercises, 60% for the exam. Extra points for completing more than 1/3 of the standard exercises (up to 10% maximum).
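As a rough illustration of how the weights above combine, the following Python sketch computes an overall score from the two components. The function name `course_score` and its parameters are hypothetical. It also assumes that the bonus is added to the exam part and that the exam part is capped at its maximum; the page says only that grading is "roughly" 40/60, and how extra points scale with the number of completed exercises is not stated here, so the bonus is simply taken as an input.

```python
def course_score(matlab_fraction, exam_fraction, bonus_fraction=0.0):
    """Overall course score as a fraction between 0 and 1.

    All arguments are fractions of the respective maximum (0.0 to 1.0).
    Assumptions for this sketch: the bonus is clipped to 10% of the
    maximum exam points, added to the exam result, and the exam part
    cannot exceed 100%.
    """
    for name, value in (
        ("matlab_fraction", matlab_fraction),
        ("exam_fraction", exam_fraction),
        ("bonus_fraction", bonus_fraction),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")

    bonus = min(bonus_fraction, 0.10)              # at most 10% of exam points
    exam_total = min(exam_fraction + bonus, 1.0)   # assumed cap at full marks

    # Rough weights from the course page: 40% Matlab exercises, 60% exam.
    return 0.40 * matlab_fraction + 0.60 * exam_total


if __name__ == "__main__":
    # 80% of the Matlab points, 70% of the exam points, 5% bonus.
    print(round(course_score(0.8, 0.7, 0.05), 3))
```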

Recommended literature

Well-suited for computer scientists: X. Huang, A. Acero, H.-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm, and System Development, Prentice Hall PTR, 2001.
J. Benesty, M. M. Sondhi, Y. Huang (Eds.), Springer Handbook of Speech Processing, Springer, 2008, 1176 pages.
Suitable for signal processing engineers: T.F. Quatieri, Discrete-Time Speech Signal Processing: Principles and Practice, Prentice Hall PTR, 2002.
Deller, Hansen, Proakis: Discrete-Time Processing of Speech Signals, 2nd edition, IEEE Press, 2000.
J. Harrington and S. Cassidy, Techniques in Speech Acoustics, Kluwer, 1999.

To study the basics of DSP, I recommend DSPguide, a free online book.