Event
M.S. Thesis Defense: Yexin Cao, TRF Estimation and Selective Auditory Attention Tracking
Friday, August 14, 2020
11:00 a.m.
Zoom (Audio- and video-enabled) meeting link: https://umd.zoom.us/j/95635631432?pwd=VnN0Uk5jbjFvUUpKRndTOWZwYjBuZz09
Maria Hoo
301 405 3681
mch@umd.edu
ANNOUNCEMENT: M.S. Thesis Defense
Name: Yexin Cao
Committee:
Professor Behtash Babadi, Chair/Advisor
Professor Jonathan Z. Simon
Professor Sennur Ulukus
Time/Date: Friday, August 14, 2020, starting at 11:00 a.m.
Location: Zoom (Audio- and video-enabled) meeting link:
https://umd.zoom.us/j/95635631432?pwd=VnN0Uk5jbjFvUUpKRndTOWZwYjBuZz09
Title: TRF Estimation and Selective Auditory Attention Tracking with Deep Kalman Filter
Abstract: The cocktail party effect refers to the phenomenon in which people focus on a single speaker in a noisy environment where multiple speakers talk at the same time, as at a cocktail party. This effect reflects the human brain's capacity for selective auditory attention, and decoding it from non-invasive electroencephalography (EEG) or magnetoencephalography (MEG) recordings has long been a topic of active research. The mapping between auditory stimuli and their neural responses can be characterized by auditory temporal response functions (TRFs). It has been shown that TRF estimates derived from the envelopes of speech streams and the corresponding auditory neural responses can be used to make predictions that discriminate between attended and unattended speakers. Previous research has adopted a regularized least squares estimator for the linear TRF model. However, most real-world applications are nonlinear, which motivates a new model for complex, realistic auditory environments. In this work, we estimated TRFs with a deep Kalman filter (DKF) model for cases in which the observations are a noisy, nonlinear function of the latent states. The DKF algorithm draws on techniques from variational inference: it introduces a recognition network to approximate the intractable posterior and optimizes the variational lower bound of the objective function. We implemented the DKF model with a two-layer bidirectional LSTM and a decoder MLP. The performance was first evaluated by applying the model to a simulated MEG dataset and reconstructing the TRFs from the observed MEG. In addition, we combined the new TRF estimation model with a previously proposed framework, replacing the framework's dynamic encoding/decoding module with a deep Kalman filter to track selective auditory attention in real time. The performance of this combined framework was validated on a simulated EEG dataset.
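For readers unfamiliar with the linear baseline mentioned in the abstract, the following is a minimal sketch of regularized least squares (ridge) TRF estimation from a speech envelope and a single-channel neural response. It is an illustration under stated assumptions, not the implementation used in the thesis: the function names, the number of lags, the ridge weight, and the simulated signals are all hypothetical.

```python
import numpy as np

def lagged_design(envelope, n_lags):
    """Build a design matrix whose column k holds the envelope delayed by k samples."""
    n = len(envelope)
    X = np.zeros((n, n_lags))
    for k in range(n_lags):
        X[k:, k] = envelope[:n - k]
    return X

def estimate_trf(envelope, response, n_lags, ridge=1.0):
    """Ridge estimate of a linear TRF: solve (X^T X + ridge * I) w = X^T y."""
    X = lagged_design(envelope, n_lags)
    A = X.T @ X + ridge * np.eye(n_lags)
    return np.linalg.solve(A, X.T @ response)

# Hypothetical simulated data: a white-noise "envelope" filtered by a known TRF.
rng = np.random.default_rng(0)
n_samples, n_lags = 5000, 20
envelope = rng.standard_normal(n_samples)
lags = np.arange(n_lags)
true_trf = np.exp(-lags / 5.0) * np.sin(lags / 3.0)
response = lagged_design(envelope, n_lags) @ true_trf
response = response + 0.1 * rng.standard_normal(n_samples)

trf_hat = estimate_trf(envelope, response, n_lags=n_lags, ridge=1.0)
```

This estimator assumes the response is a linear, time-invariant function of the stimulus envelope plus noise, which is the assumption the deep Kalman filter model in the thesis is meant to relax.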