Portfolio item number 1
Short description of portfolio item number 1
Short description of portfolio item number 2
Published in IEEE Signal Processing Letters, 2020
In this letter, we propose an efficient end-to-end speech enhancement model, WaveCRN, which combines a CNN module for capturing speech locality features with a stacked SRU module for modeling sequential properties, using a novel restricted feature masking approach to achieve state-of-the-art performance with reduced complexity and faster inference.
Download here
Published in APSIPA ASC, 2020
In this study, we apply a modified Transformer architecture to speech enhancement by replacing positional encoding with convolutional layers and fine-tuning the model using a MetricGAN framework to boost perceptual quality (PESQ) scores, achieving significant improvements over the baseline in both subjective and objective evaluations on the DNS challenge datasets.
Download here
Published in Proc. Interspeech, 2021
In this study, we propose a phone-fortified perceptual loss (PFPL) for speech enhancement, leveraging phonetic information from the wav2vec model and utilizing the Wasserstein distance to improve speech quality and intelligibility, demonstrating superior performance compared to signal-level losses on standardized evaluations.
Download here
Published in MLSP, 2023
This study introduces a causal inference-based speech enhancement (CISE) framework that models noise presence as an intervention, using a noise detector and mask-based enhancement modules to perform noise-conditional speech enhancement, demonstrating improved performance and efficiency compared to non-causal and more complex SE models.
Download here
Published in Proc. ICASSP, 2024
In this work, we present a low-latency, computationally efficient time-domain framework for multichannel speech enhancement, featuring two compact deep neural networks (DNNs) surrounding a multichannel neural Wiener filter (NWF), which together achieve superior performance with fewer parameters and reduced computational demands.
Download here
Published in Proc. Interspeech, 2024
In this study, we introduce a timed text-based regularization method to enhance speech separation models by aligning audio and word embeddings using pretrained WavLM and BERT models, leading to improved separation performance without needing auxiliary text data during testing.
Download here
Published:
This is a description of your talk, which is a markdown file that can be markdown-ified like any other post. Yay markdown!
Published:
This is a description of your conference proceedings talk; note the different value in the type field. You can put anything in this field.
Undergraduate course, University 1, Department, 2014
This is a description of a teaching experience. You can use markdown like any other post.
Workshop, University 1, Department, 2015
This is a description of a teaching experience. You can use markdown like any other post.