Bio

I am a fourth-year Ph.D. student in Computer Science at the University of Illinois Urbana–Champaign, advised by Prof. Minje Kim. Before starting my Ph.D., I worked as a research assistant at Academia Sinica in Taiwan, in the Bio-ASP Lab under the guidance of Dr. Yu Tsao. I hold an M.S. in Networking and Multimedia from National Taiwan University and a B.S. in Computer Science and Information Engineering from Chang Gung University.

My research sits at the intersection of machine learning and speech/audio processing. I develop front-end algorithms for speech enhancement, source separation, and target speaker extraction, with publications at ICASSP, Interspeech, and related venues. I am now extending this work to conversational speech processing, where robust front-end models can better support downstream understanding and interaction.

Selected Publications

Adaptive Deterministic Flow Matching for Target Speaker Extraction

T.-A. Hsieh, M. Kim — submitted to ICASSP 2026.

An adaptive deterministic flow-matching approach for target speaker extraction achieving state-of-the-art performance.

Towards Real-Time Generative Speech Restoration with Flow-Matching

T.-A. Hsieh, S. Braun — submitted to ICASSP 2026.

Real-time, causal flow-matching speech restoration (10 ms algorithmic latency).

TGIF: Talker Group-Informed Familiarization of Target Speaker Extraction

T.-A. Hsieh, M. Kim — WASPAA 2025.

Group-aware TSE that learns familiarity with a set of talkers (e.g., a family) instead of a single identity.

Multimodal Representation Loss Between Timed Text and Audio for Regularized Speech Separation

T.-A. Hsieh, H. Choi, M. Kim — Interspeech 2024.

Uses timed text–audio alignment as a regularizer to improve separation quality.

On The Importance of Neural Wiener Filter for Resource-Efficient Multichannel Speech Enhancement

T.-A. Hsieh, J. Donley, D. Wong, B. Xu, A. Pandey — ICASSP 2024.

Highlights Wiener filtering choices that matter for efficient multichannel enhancement.

Inference and Denoise: Causal Inference-based Neural Speech Enhancement

T.-A. Hsieh, C.-H. H. Yang, P.-Y. Chen, S. M. Siniscalchi, Y. Tsao — MLSP 2023 (Oral).

Incorporates causal inference ideas into neural speech enhancement.

Improving Perceptual Quality by Phone-Fortified Perceptual Loss using Wasserstein Distance for Speech Enhancement

T.-A. Hsieh, C. Yu, S.-W. Fu, X. Lu, Y. Tsao — Interspeech 2021.

SSL-based feature loss for speech enhancement.

MetricGAN+: An Improved Version of MetricGAN for Speech Enhancement

S.-W. Fu, C. Yu, T.-A. Hsieh, P. Plantinga, M. Ravanelli, X. Lu, Y. Tsao — Interspeech 2021.

Adversarial training with a speech quality estimation network for speech enhancement.

Research Experiences

University of Illinois Urbana–Champaign

Research Assistant — Aug 2024–Present
Advisor: Prof. Minje Kim
Research: Speech enhancement, source separation, target speaker extraction, and conversational speech processing

Microsoft Research

Research Intern — May 2025–Aug 2025
Mentor: Dr. Sebastian Braun
Research: Real-time speech restoration using flow-matching

Meta Reality Labs, Audio

Research Scientist Intern — May 2023–Aug 2023
Mentor: Dr. Ashutosh Pandey
Research: Resource-efficient neural beamformer for speech enhancement

Academia Sinica

Research Assistant — Dec 2018–Jul 2022
Advisor: Dr. Yu Tsao
Research: Metric-oriented and metric-agnostic objective functions for speech enhancement

Education

University of Illinois Urbana–Champaign

Ph.D. in Computer Science — Aug 2024–Present
Advisors: Minje Kim, Paris Smaragdis
Fellowship: CS PhD Fellowship Addendum

Indiana University

Ph.D. Student in Intelligent Systems Engineering — Aug 2022–May 2024
Advisor: Minje Kim
Fellowship: Luddy Doctoral Summer Fellowship

National Taiwan University

M.S. in Networking and Multimedia — Sep 2016–Jun 2018
Advisor: Chiou-Shann Fuh
Thesis: 3D Face Identification and Reconstruction with Range Sensor

Chang Gung University

B.S. in Computer Science and Information Engineering — Sep 2011–Jun 2015

Academic Services

Reviewer

  • IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) — 2023–2025
  • Conference on Neural Information Processing Systems (NeurIPS) — 2024
  • IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) — 2023
  • IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) — 2023