<p>There is no simple way to recognize words, because a word is essentially a sequence of phonemes that can vary in time and frequency.</p>
<p>Classical isolated-word recognition systems use the signal's <a href="http://en.wikipedia.org/wiki/Mel-frequency_cepstral_coefficient" rel="nofollow">MFCC</a> (mel-frequency cepstral coefficients) as input data, and try to recognize patterns using HMM (hidden Markov models) or DTW (dynamic time warping) algorithms.</p>
<p>You will also need a silence detection module if you don't want a record button.</p>
<p>For instance, the <a href="http://www.cstr.ed.ac.uk/projects/speech_tools/" rel="nofollow">Edinburgh University toolkit</a> provides some of these tools (with good documentation).</p>
<p>If you don't want to build it "from scratch", or want a source of inspiration, <a href="http://www.isip.piconepress.com/projects/speech/index.html" rel="nofollow">here</a> is an (old but free) implementation of such a system (which uses its own toolkit) with a <a href="http://www.isip.piconepress.com/projects/speech/software/tutorials/production/fundamentals/current/" rel="nofollow">full explanation and practical examples</a> of how it works.</p>
<p>This system is an LVCSR (Large-Vocabulary Continuous Speech Recognition) system, and you only need a subset of it. If anyone knows of an open-source reduced-vocabulary system (like a simple IVR), it would be welcome.</p>
<p>If you want to build a basic system on your own, I recommend using MFCC and DTW:</p>
<ul>
<li>For each target word to model:
<ul>
<li>record some instances of the word</li>
<li>compute delta-MFCC features at regular intervals (e.g. every 10 ms) across the word to form a model</li>
</ul></li>
<li>When you want to recognize a signal:
<ul>
<li>compute the delta-MFCC features of this signal</li>
<li>use DTW to compare these delta-MFCC features to the delta-MFCC features of each modeled word</li>
<li>output the word that fits best (use a threshold to drop garbage)</li>
</ul></li>
</ul>
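<p>The matching step described above can be sketched roughly as follows. This is only a minimal illustration, not a reference implementation: it assumes the delta-MFCC frames have already been computed elsewhere and are passed in as lists of float vectors. The names <code>frame_distance</code>, <code>dtw_distance</code> and <code>recognize</code> are hypothetical, the Euclidean frame distance and the normalization by path length are design choices rather than requirements, and the threshold value has to be tuned on real recordings.</p>

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two feature frames (e.g. delta-MFCC vectors)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Dynamic time warping cost between two sequences of frames.

    The accumulated cost is divided by len(seq_a) + len(seq_b) so that
    long and short words give comparable scores (an assumed normalization).
    """
    n, m = len(seq_a), len(seq_b)
    if n == 0 or m == 0:
        return float("inf")
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # stretch seq_b
                cost[i][j - 1],      # stretch seq_a
                cost[i - 1][j - 1],  # advance both
            )
    return cost[n][m] / (n + m)


def recognize(features, templates, threshold):
    """Return the best-matching word, or None if nothing is close enough.

    templates maps each word to a list of recorded instances, each instance
    being a sequence of frames. threshold is the "drop garbage" cutoff.
    """
    best_word, best_score = None, float("inf")
    for word, instances in templates.items():
        for instance in instances:
            score = dtw_distance(features, instance)
            if score < best_score:
                best_word, best_score = word, score
    return best_word if best_score <= threshold else None


# Toy one-dimensional "features" standing in for real delta-MFCC frames.
templates = {
    "yes": [[[0.0], [1.0], [2.0]]],
    "no": [[[5.0], [5.0], [5.0]]],
}
slow_yes = [[0.0], [0.0], [1.0], [2.0]]  # same shape as "yes", spoken slower
print(recognize(slow_yes, templates, threshold=0.5))
print(recognize([[100.0]], templates, threshold=0.5))
```

<p>Storing several recorded instances per word, as in the first list item, lets the recognizer pick the closest one, which makes the match somewhat more tolerant of speaker and tempo variation.</p>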
 
