    <p>Implementing SAPI training is relatively hard, and the documentation doesn’t really tell you what you need to know.</p>
    <p><a href="http://msdn.microsoft.com/en-us/library/ms718576(VS.85).aspx" rel="noreferrer">ISpRecognizer2::SetTrainingState</a> switches the recognizer into or out of training mode.</p>
    <p>When you go into training mode, all that really happens is that the recognizer gives the user a lot more leeway: if you’re trying to recognize a phrase, the engine will be much less strict about matching it.</p>
    <p>The engine doesn’t do any adaptation until you leave training mode with the fAdaptFromTrainingData flag set.</p>
    <p>When the engine adapts, it scans the training audio stored under the profile data. It’s the training code’s responsibility to put new audio files where the engine can find them for adaptation.</p>
    <p>These files also have to be labeled, so that the engine knows what was said.</p>
    <p>So how do you do this? You need to use three lesser-known SAPI APIs. In particular, you need to get the profile token using <a href="http://msdn.microsoft.com/en-us/library/ms718612(VS.85).aspx" rel="noreferrer">ISpRecognizer::GetObjectToken</a>, and then use <a href="http://msdn.microsoft.com/en-us/library/ms718252(VS.85).aspx" rel="noreferrer">ISpObjectToken::GetStorageFileName</a> to properly locate the file.</p>
    <p>Finally, you also need to use <a href="http://msdn.microsoft.com/en-us/library/ms719553(VS.85).aspx" rel="noreferrer">ISpTranscript</a> to generate properly labeled audio files.</p>
    <p>To put it all together, you need to do the following (pseudo-code):</p>
    <p>Create an inproc recognizer and bind the appropriate audio input.</p>
    <p>Ensure that you’re retaining the audio for your recognitions; you’ll need it later.</p>
    <p>Create a grammar containing the text to train.</p>
    <p>Set the grammar’s state to pause the recognizer when a recognition occurs. (This helps with training from an audio file, as well.)</p>
    <p>When a recognition occurs:</p>
    <p>Get the recognized text and the retained audio.</p>
    <p>Create a stream object using CoCreateInstance(CLSID_SpStream).</p>
    <p>Create a training audio file using <a href="http://msdn.microsoft.com/en-us/library/ms718612(VS.85).aspx" rel="noreferrer">ISpRecognizer::GetObjectToken</a> and <a href="http://msdn.microsoft.com/en-us/library/ms718252(VS.85).aspx" rel="noreferrer">ISpObjectToken::GetStorageFileName</a>, and bind it to the stream (using <a href="http://msdn.microsoft.com/en-us/library/ms719484(VS.85).aspx" rel="noreferrer">ISpStream::BindToFile</a>).</p>
    <p>Copy the retained audio into the stream object.</p>
    <p>Call QueryInterface on the stream object for the <a href="http://msdn.microsoft.com/en-us/library/ms719553(VS.85).aspx" rel="noreferrer">ISpTranscript</a> interface, and use <a href="http://msdn.microsoft.com/en-us/library/ms719554(VS.85).aspx" rel="noreferrer">ISpTranscript::AppendTranscript</a> to add the recognized text to the stream.</p>
    <p>Update the grammar for the next utterance, resume the recognizer, and repeat until you’re out of training text.</p>