
Low Latency Voice Conversion

Project ID: 6904-1-22
Year: 2024
Student/s: Lior Bashari, Yonatan Kleerekoper
Supervisor/s: Yair Moshe

Voice conversion (VC) modifies one or more aspects of a speech signal while preserving its linguistic content. Deep learning-based voice conversion is a relatively new area that focuses mainly on improving quality, but it often suffers from high latency due to sequential computation and high computational complexity. The project's goal is to develop a deep learning-based VC system with a latency of at most 400 milliseconds, suitable for real-time applications. We propose an approach based on the low-latency QuickVC model by Guo et al. Our solution uses 5-second windows with a 250-millisecond delay on the first window, enabling real-time processing while maintaining high quality.
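The windowing idea can be illustrated with a minimal Python sketch. This is not the project's implementation: it assumes a 16 kHz sample rate, assumes that audio arrives in 250-millisecond chunks, and assumes that each new chunk is appended to a rolling 5-second buffer from which only the newest converted samples are emitted. The names `stream_convert` and `identity_convert` are hypothetical, and `identity_convert` is a stand-in for a real conversion model such as QuickVC.

```python
import numpy as np

SAMPLE_RATE = 16_000   # assumed sample rate in Hz (not stated in the project description)
WINDOW_SEC = 5.0       # window length taken from the project description
CHUNK_SEC = 0.25       # assumed chunk duration, chosen to match the 250 ms figure


def identity_convert(window: np.ndarray) -> np.ndarray:
    """Hypothetical placeholder for the VC model: returns the window unchanged."""
    return window


def stream_convert(audio, convert=identity_convert, sample_rate=SAMPLE_RATE,
                   window_sec=WINDOW_SEC, chunk_sec=CHUNK_SEC):
    """Convert audio chunk by chunk using a rolling context window."""
    window_len = int(window_sec * sample_rate)
    chunk_len = int(chunk_sec * sample_rate)
    # The buffer starts as silence, so the first output is ready after one chunk.
    buffer = np.zeros(window_len, dtype=audio.dtype)
    outputs = []
    for start in range(0, len(audio), chunk_len):
        chunk = audio[start:start + chunk_len]
        n = len(chunk)  # the final chunk may be shorter than chunk_len
        # Drop the oldest n samples and append the new chunk.
        buffer = np.concatenate([buffer[n:], chunk])
        converted = convert(buffer)
        # Emit only the samples that correspond to the newest chunk.
        outputs.append(converted[-n:])
    if not outputs:
        return np.zeros(0, dtype=audio.dtype)
    return np.concatenate(outputs)
```

Under these assumptions, the buffering delay equals one chunk (250 milliseconds), so the model's inference time per window would need to fit within the remaining budget to stay under the 400-millisecond target.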

Collaborators:
Elbit Systems