Friday, May 8, 2020 - 4:00pm to 5:00pm
Location: Virtual Presentation, Remote Access Enabled (Zoom)
Speaker: TIANJUN MA, Master's Student
Monaural Audio Source Separation in the Wild
Monaural audio source separation remains one of the most challenging tasks in signal processing. The task can be described as finding a function that takes a single-channel audio stream containing mixed source signals and returns the disaggregated source signals from the mixture. In recent years, many supervised learning algorithms have been proposed and shown to perform well on monaural audio source separation under laboratory settings, where mixtures consist of a constrained set of sound categories. We want to extend the source separation setting to more challenging real-world scenarios in which audio mixtures contain naturally occurring sounds of diverse categories. In this work, we present our audio data collection process and introduce the Wild-mix Synthesis Toolkit, a library we built to create source separation datasets that emulate real-world auditory scenes. We also present the Attentive Spatio-temporal Disaggregator (ASTD), a deep neural network model that achieves state-of-the-art performance in both synthesized real-world settings and specialized settings, including speech denoising and speaker separation.
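The abstract's setup can be made concrete with a small sketch. The Wild-mix Synthesis Toolkit's actual API is not shown in this announcement, so the function below is hypothetical: it only illustrates the general idea of synthesizing a supervision pair for source separation by summing independent single-channel source signals into one mono mixture.

```python
import numpy as np

def make_mixture(sources, weights=None):
    """Sum equal-length single-channel source signals into a mono mixture.

    sources: list of 1-D numpy arrays, one per sound category.
    weights: optional per-source gains (e.g. to vary relative loudness).
    Returns (mixture, stacked_sources): the training pair a separation
    model would use, with the mixture as input and the stacked sources
    as the target to disaggregate.
    """
    stacked = np.stack(sources)                    # (n_sources, n_samples)
    if weights is not None:
        stacked = stacked * np.asarray(weights)[:, None]
    mixture = stacked.sum(axis=0)                  # single-channel mix
    return mixture, stacked

# Toy usage: two synthetic "sources" at different frequencies.
sr = 16000
t = np.arange(sr) / sr
low_tone = np.sin(2 * np.pi * 220 * t)
high_tone = np.sin(2 * np.pi * 3000 * t)
mix, targets = make_mixture([low_tone, high_tone], weights=[1.0, 0.5])
```

A real toolkit would additionally draw clips from recorded audio of diverse categories and randomize offsets and gains; this sketch only shows the core mixing step.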
Louis-Philippe Morency (Chair)
Zoom Participation Enabled. See announcement.