The Institute for Systems Research
Speech Communication Lab
Carol Espy-Wilson is a Professor in the Electrical and Computer Engineering Department and the Institute for Systems Research at the University of Maryland.
Dr. Espy-Wilson received her B.S. in Electrical Engineering from Stanford University. She received her M.S., E.E. and Ph.D. degrees in Electrical Engineering from the Massachusetts Institute of Technology. Prior to joining the faculty at the University of Maryland, Dr. Espy-Wilson was a faculty member at Boston University.
Dr. Espy-Wilson's research is in speech communication. She combines knowledge of digital signal processing, speech science, linguistics, acoustic phonetics and machine learning to conduct interdisciplinary research in speech and speaker recognition, speech production, speech enhancement and single-channel speech segregation. She also analyzes speech as a behavioral signal for emotion recognition, sentiment analysis and the detection and monitoring of mental health.
Her company, OmniSpeech, translated her lab's research on noise suppression and speech enhancement into software that improves speech-enabled technology in any device, app or platform.
Dr. Espy-Wilson has authored or coauthored numerous papers in journals, conference proceedings and books. She is a Fellow of the International Speech Communication Association (ISCA) and the Acoustical Society of America (ASA) and a Senior Member of IEEE. She was a Radcliffe Fellow at Harvard in 2008–2009. Among the other honors and awards she has received for her research contributions are the Clare Boothe Luce Professorship in 1990, the Independent Scientist Award from the National Institutes of Health in 1998 and the Honda Initiation Award in 2003. She served as the chair of the Speech Communication Technical Committee of the Acoustical Society of America (2007–2010) and as an Associate Editor of the ASA's magazine, Acoustics Today. She was a member of the National Advisory Board for Medical Rehabilitation at the National Institutes of Health. Currently, she is a member of the Advisory Council for the NIH National Institute of Biomedical Imaging and Bioengineering and an Associate Editor of the Journal of the Acoustical Society of America. In 2019 she chaired the NSF Speech for Robotics Workshop.
- Fellow, Institute of Electrical and Electronics Engineers (IEEE) (2021)
- Editorial Board, Computer Speech and Language (2021–)
- Advisory Council, National Institute on Deafness and Other Communication Disorders, NIH (2019-present)
- University of Maryland Campus Woman of Influence (2020)
- First African American woman, and first African American, in ECE to achieve tenure and be promoted to the rank of full professor (University of Maryland First to ADVANCE Program, 2019)
- Jimmy Lin Award for Innovation (2018)
- Fellow of the International Speech Communication Association (2018)
- Associate Editor, Journal of the Acoustical Society of America
- Advisory Council, NIH National Institute of Biomedical Imaging and Bioengineering (2015-2018)
- Institute for Systems Research Senior Faculty Fellow Award (2015-2017)
- Distinguished Scholar-Teacher Award, University of Maryland (2012-2013)
- Advance Professor, University of Maryland (2011-2012)
- Elected to the Speech and Language Technical Committee of IEEE (2010-2012)
- Invention of the Year Award, University of Maryland (2010)
- Maryland Innovator of the Year Award, Baltimore Daily Record (2010)
- Grand Prize, Rockville Economic Development Inc. (REDI) StartRight! Women’s Business Plan Competition, 2010
- $50,000 SAIC-VentureAccelerator Competition, 2010
- University of Maryland $75K Business Plan Competition (High Technology & Social Impact), 2010
- Invention of the Year (Information Science): OmniSpeech, 2010
- Chair, Speech Communication Technical Committee, Acoustical Society of America (2007-2010)
- Editorial Board, Acoustics Today, Acoustical Society of America (2007-2009)
- Fellow, Radcliffe Institute for Advanced Study, Harvard University (2008)
- Fellow of the Acoustical Society of America (2005)
- Honda Initiation Award (2003)
- Honda Initiation Award (2004)
- Member, NIH Language and Communication Study Section (2001-2004)
- NIH Career Award (1998-2003)
- Clare Boothe Luce Professor (1990-1995)
Integration of engineering, linguistics, speech science and machine learning to study speech communication and develop robust speech technologies. Digital signal processing, speech science, speech enhancement and segregation, noise-robust automatic speech recognition, assistive technologies.
- C. Espy-Wilson, G. Sivaraman, M. Tiede, V. Mitra, E. Saltzman, L. Goldstein, H. Nam, “Modeling of Articulatory Gestures to Control Effects of Production Variability on Speech Technologies”. In Cangemi, Clayards, Niebuhr, Schuppler & Zellers (eds). Rethinking Reduction, Berlin: Mouton de Gruyter, 2018.
- Vikramjit Mitra, Ganesh Sivaraman, Hosung Nam, Carol Espy-Wilson, Elliot Saltzman, Mark Tiede, “Hybrid Convolutional Neural Networks For Articulatory And Acoustic Information Based Speech Recognition”, Speech Communication, Vol 89, Issue C, pp. 103-112, 2017.
- V. Mitra, H. Nam, C. Espy-Wilson, E. Saltzman, and L. Goldstein, “Recognizing articulatory gestures from speech for robust speech recognition”, Journal of the Acoustical Society of America, vol. 131, no. 3, pp. 2270-2287, 2012.
- X. Zhou, C. Espy-Wilson, S. Boyce, M. Tiede, Christy Holland and Ann Choe, “A magnetic resonance imaging-based articulatory and acoustic study of “retroflex” and “bunched” American English /r/ sounds”, Journal of the Acoustical Society of America, vol. 123, no. 6, pp. 4466-4481, 2008.
- A. Juneja and Carol Espy-Wilson, “Probabilistic landmark detection for automatic speech recognition using acoustic-phonetic information”, Journal of the Acoustical Society of America, vol. 123, no. 2, pp. 1154-1168, 2008.
- T. Pruthi, C. Espy-Wilson and Brad Story, “Simulation and analysis of nasalized vowels based on MRI data”, Journal of the Acoustical Society of America, vol. 121, no. 6, pp. 3858-3873, 2007.
- O. Deshmukh, C. Espy-Wilson, L. Carney, “Speech Enhancement using the Modified Phase Opponency Model”, Journal of the Acoustical Society of America, vol. 121, no. 6, pp. 3886-3898, 2007.
- C. Y. Espy-Wilson, S. Boyce, M. Jackson, S. Narayanan and A. Alwan, “Acoustic Modeling of American English /r/”, Journal of the Acoustical Society of America, pp. 343-356, 2000.
- C. Y. Espy-Wilson, V. R. Chari, J. M. MacAuslan, C. B. Huang and M. J. Walsh, “Enhancement of Electrolaryngeal Speech by Adaptive Filtering”, Journal of Speech, Language and Hearing Research, vol. 41, no. 6, December, pp. 1253-1264, 1998.
- F. Guenther, C. Espy-Wilson, S. Boyce, M. Matthies, M. Zandipour and J. Perkell, “Articulatory tradeoffs reduce acoustic variability during American English /r/ production”, Journal of the Acoustical Society of America, vol. 105, no. 5, pp. 2854-2865, 1999.
- C. Y. Espy-Wilson, “A Feature-Based Semivowel Recognition System,” Journal of the Acoustical Society of America, vol. 96, no. 1, pp. 65-72, 1994.
- C. Y. Espy-Wilson, “Acoustic Measures for Linguistic Features Distinguishing the Semivowels /wjrl/ in American English,” Journal of the Acoustical Society of America, vol. 92, no. 2, pp. 736-751, 1992.
International Speech Communication Association
- Fellow, 2018
Acoustical Society of America
- Fellow, 2005
Institute of Electrical and Electronics Engineers (IEEE)
- Pitch detection algorithm based on PWVT of teager energy operator
- Systems and Methods for Speech Extraction
- Multiple pitch extraction by strength calculation from extrema
- Systems and methods for multiple pitch tracking using a multidimensional function and strength values
- System and method for automatic speech recognition from phonetic features and acoustic landmarks
- System and method for automatic speech recognition from phonetic features and acoustic landmarks
- Electrolaryngeal Speech Enhancement for Telephony
- Process to Introduce Realistic Pitch Variation in Artificial Larynx Speech
- Using Multi-Stage Learning to Prioritize Mental Health
- NSF IIS: Neuromorphic and Data-Driven Speech Segregation
- NSF Collaborative Research: Effects of production variability on the acoustic consequences of coordinated articulatory gestures
- Speech Processing Algorithms for Elderly Listeners with Hearing Loss
- Multilingual Gestural Models for Robust Language-Independent Speech Recognition
- NSF CIF: Nonintrusive Digital Speech Forensics: Source Identification and Content Authentication
- Advanced speech enhancement software
- Predictors of Speech Quality after Tongue Cancer Surgery
- NSF RI: Extension of the APP detector for multipitch tracking and speaker separation
- NSF RI Collaborative Research: Landmark-based Robust Speech Recognition Using Prosody-Guided Models of Speech Variability
- Adversarial Auto-encoders for Speech Emotion Recognition
- Articulatory representations to address acoustic variability in speech
- Effects of depression on speech
- Speech production information for enhanced speech recognition
- Biobehavioral markers of depression: Integrated signal processing of speech, facial expressions and physiology
- A study of emotional cues in speech
- Multi-faceted research in Speech Communications
- An Enhancement of Modified Phase Opponency for Noise-Robust Speech Recognition
- Speech Segregation from Co-Channel Mixtures
- From Acoustics to Vocal-Tract Time Functions
- Speech Enhancement for Noise-Robust Speech Recognition
- Language Detection for Music Information Retrieval
- Landmark-Based Speech Recognition
- A Novel Speaker Verification System using Samples-Based Acoustical Models
- Synergy of Acoustic-Phonetics and Peripheral Auditory Modeling Towards Robust Speech Recognition
- Acoustic Parameters for Automatic Detection of Nasal Manner
- Acoustic-phonetic Approach to Speech Recognition Based on Landmark Detection