Speaker Diarization using Deep Learning

Project ID: 4840-1-18
Year: 2019
Student/s: Matanel Yaacov, Shay Avig
Supervisor/s: Nurit Spingarn
Award: Wilk award

Speaker diarization is the process of dividing an audio recording or audio stream into segments according to the speaker's identity.
It is designed to answer the question "Who spoke and when?" and is useful wherever it is important to know the speaker's identity: for example, phone calls, radio interviews, podcasts, and even emergencies in which recordings from the scene are investigated (such as aircraft black boxes).
Speaker diarization is a well-known task that has traditionally been implemented with classical algorithms from audio signal processing. In this project we implemented real-time diarization based on machine learning techniques.
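
To make the "Who spoke and when?" output concrete, the following minimal sketch shows one common way a diarization result can be represented: a list of (start, end, speaker) segments. It is not the project's implementation. It assumes that some model has already assigned a speaker label to each fixed-length audio frame; the function name `frames_to_segments`, the labels, and the 0.5-second frame length are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

Segment = Tuple[float, float, str]  # (start time in s, end time in s, speaker label)


def frames_to_segments(labels: List[str], frame_sec: float = 0.5) -> List[Segment]:
    """Merge consecutive frames with the same speaker label into time segments.

    `labels` holds one (hypothetical) speaker label per fixed-length frame,
    as a frame-level classifier might produce; `frame_sec` is the assumed
    frame duration in seconds.
    """
    segments: List[Segment] = []
    if not labels:
        return segments
    start = 0  # index of the first frame in the current segment
    for i in range(1, len(labels) + 1):
        # Close the current segment at the end of input or when the speaker changes.
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((start * frame_sec, i * frame_sec, labels[start]))
            start = i
    return segments


if __name__ == "__main__":
    # Four 0.5 s frames: speaker A, A, B, A
    for seg in frames_to_segments(["A", "A", "B", "A"]):
        print(seg)
    # (0.0, 1.0, 'A')
    # (1.0, 1.5, 'B')
    # (1.5, 2.0, 'A')
```

In a real-time setting, the same merging step could be applied incrementally as new frame labels arrive; how the labels themselves are produced is outside the scope of this sketch.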

Poster for Speaker Diarization using Deep Learning