A probabilistic framework for acoustic-phonetic automatic speech recognition organizes a set of phonetic features into a hierarchy consisting of a broad manner feature sub-hierarchy and a fine phonetic feature sub-hierarchy. Each phonetic feature of said hierarchy corresponds to a set of acoustic correlates, and each broad manner feature of said broad manner feature sub-hierarchy is further associated with a corresponding set of acoustic landmarks. A pattern recognizer is trained from a knowledge base of phonetic features and corresponding acoustic correlates. Acoustic correlates are extracted from a speech signal and are presented to the pattern recognizer. Acoustic landmarks are identified and located from the broad manner classes classified by the pattern recognizer. Fine phonetic features are determined by the pattern recognizer at and around the acoustic landmarks. The determination of fine phonetic features may be constrained by a pronunciation model. The most probable feature bundles corresponding to words and sentences are those that maximize the joint a posteriori probability of the fine phonetic features and corresponding acoustic landmarks. When the hierarchy is organized as a binary tree, binary classifiers such as Support Vector Machines can be used in the pattern recognizer, and the outputs thereof can be converted to probability measures which, in turn, may be used in the computation of the aforementioned joint probability of fine phonetic features and corresponding landmarks.
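The conversion of binary classifier outputs into probabilities, and their combination along a binary-tree hierarchy, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the specification: the feature names (`speech`, `sonorant`, `syllabic`, `continuant`), the tree layout in `MANNER_PATHS`, and the sigmoid parameters `a` and `b`, which in practice would be fitted on held-out data (Platt-style scaling). The sketch further assumes that each node's classifier estimates the probability of its feature given the decisions made at its ancestors, so that the posterior of a broad manner class is the product of the probabilities along its path.

```python
import math

def svm_to_prob(score, a=-1.0, b=0.0):
    """Map a raw SVM decision value to P(feature = +1 | score).

    Platt-style sigmoid 1 / (1 + exp(a * score + b)). The defaults for
    a and b are placeholders (assumption), not fitted values.
    """
    return 1.0 / (1.0 + math.exp(a * score + b))

# Hypothetical binary tree of broad manner features (assumption).
# Each class is the path of (feature, value) decisions from the root.
MANNER_PATHS = {
    "silence": [("speech", False)],
    "vowel": [("speech", True), ("sonorant", True), ("syllabic", True)],
    "sonorant consonant": [("speech", True), ("sonorant", True), ("syllabic", False)],
    "fricative": [("speech", True), ("sonorant", False), ("continuant", True)],
    "stop": [("speech", True), ("sonorant", False), ("continuant", False)],
}

def class_posteriors(svm_scores):
    """Combine per-feature SVM scores into a posterior per manner class.

    svm_scores maps each feature name to the raw decision value of its
    binary classifier for one analysis frame.
    """
    probs = {feature: svm_to_prob(s) for feature, s in svm_scores.items()}
    posteriors = {}
    for label, path in MANNER_PATHS.items():
        p = 1.0
        for feature, value in path:
            # Use P(feature) for a positive branch, 1 - P(feature) otherwise.
            p *= probs[feature] if value else 1.0 - probs[feature]
        posteriors[label] = p
    return posteriors

if __name__ == "__main__":
    frame_scores = {"speech": 2.0, "sonorant": 2.0, "syllabic": 2.0, "continuant": 0.0}
    for label, p in class_posteriors(frame_scores).items():
        print(f"{label}: {p:.3f}")
```

Because every internal node splits its probability mass between two children, the class posteriors for a frame sum to one. A full recognizer would evaluate such products over sequences of landmarks and fine phonetic features rather than on a single frame.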
U.S. Patent and Trademark Office
Inventor(s): Carol Espy-Wilson, Amit Juneja