StackOverflow2013


plurals

POIssue distinguishing commands from normal speech with SAPI
primarykey
Id: 7955625
data
AcceptedAnswerId: 7966918
AnswerCount: 4
ClosedDate:
CommentCount: 5
CommunityOwnedDate:
CreationDate: 2011-10-31T15:12:18.390
FavoriteCount: 2
LastActivityDate: 2012-03-02T16:11:27.493
LastEditDate: 2011-11-02T13:08:31.223
LastEditorUserId: 612541
OwnerUserId: 612541
ParentId: 0
PostTypeId: 1
Score: 1
ViewCount: 2053
LastEditorDisplayName:
text
Body
I'm working on a personal project involving microphones in my apartment that I can issue verbal commands to. To accomplish this, I've been using the Microsoft Speech API, specifically SpeechRecognitionEngine from System.Speech.Recognition in C#. I construct a grammar as follows:
<pre><code>// validCommands is a Choices object containing all valid command strings
// recognizer is a SpeechRecognitionEngine
GrammarBuilder builder = new GrammarBuilder(recognitionSystemName);
builder.Append(validCommands);
recognizer.SetInputToDefaultAudioDevice();
recognizer.LoadGrammar(new Grammar(builder));
recognizer.RecognizeAsync(RecognizeMode.Multiple);
// etc ...
</code></pre>
This works pretty well when I actually give it a command; it hasn't misidentified one of my commands yet. Unfortunately, it also tends to pick up random talking as commands!
I've tried to ameliorate this by prefacing the command Choices object with a "name" (recognitionSystemName) that I address the system by. Oddly, this doesn't seem to help. I am restricting it to a set of predetermined command phrases, so I would have thought that it would be able to detect when speech wasn't any of the strings. My best guess is that it's assuming that all sound is a command and picking the best match from the command set. Any advice on improving this system so that it no longer triggers on conversation not directed at it would be very helpful.
Edit: I've moved the name recognizer to a separate SpeechRecognitionEngine, but the accuracy is awful.
Here's a bit of test code I wrote to examine the accuracy:
<pre><code>using System;
using System.Globalization;
using System.IO;
using System.Speech.Recognition;

namespace RecognitionAccuracyTest
{
    class RecognitionAccuracyTest
    {
        static int recogcount;

        [STAThread]
        static void Main()
        {
            recogcount = 0;
            Console.WriteLine("Beginning speech recognition accuracy test.");

            // The grammar contains only the single word "Octavian".
            SpeechRecognitionEngine recognizer = new SpeechRecognitionEngine(new CultureInfo("en-US"));
            recognizer.SetInputToDefaultAudioDevice();
            recognizer.LoadGrammar(new Grammar(new GrammarBuilder("Octavian")));
            recognizer.SpeechHypothesized += recognizer_SpeechHypothesized;
            recognizer.SpeechRecognized += recognizer_SpeechRecognized;
            recognizer.RecognizeAsync(RecognizeMode.Multiple);

            // Block until Enter is pressed rather than spinning in while (true).
            Console.ReadLine();
        }

        static void recognizer_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
        {
            Console.WriteLine("Recognized @ " + e.Result.Confidence);
            try
            {
                // Save the audio behind each recognition so it can be reviewed later.
                if (e.Result.Audio != null)
                {
                    using (FileStream stream = new FileStream("audio" + ++recogcount + ".wav", FileMode.Create))
                    {
                        e.Result.Audio.WriteToWaveStream(stream);
                    }
                }
            }
            catch (Exception)
            {
                // Ignore failures to write the clip; this is only a test harness.
            }
        }

        static void recognizer_SpeechHypothesized(object sender, SpeechHypothesizedEventArgs e)
        {
            Console.WriteLine("Hypothesized @ " + e.Result.Confidence);
        }
    }
}
</code></pre>
If the name is "Octavian", it recognizes stuff like "Octopus", "Octagon", "Volkswagen", and "Wow, really?". I can clearly hear the difference in the associated audio clips. Any ideas on making this not awful would be great.
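Since the test code above already prints e.Result.Confidence for every recognition, one experiment it makes easy is a confidence cutoff: treat a result as a command only if its confidence reaches some minimum. The following is a minimal sketch of that idea, not a known fix. The ConfidenceGate class, the IsCommand helper, and the 0.9 cutoff are all hypothetical names and values introduced for illustration, and the sample confidences are made up rather than measured.

```csharp
using System;

static class ConfidenceGate
{
    // Hypothetical cutoff; a usable value would have to be tuned against real recordings.
    const float MinConfidence = 0.9f;

    // Returns true only when the reported confidence reaches the cutoff.
    public static bool IsCommand(float confidence, float minConfidence = MinConfidence)
    {
        return confidence >= minConfidence;
    }

    static void Main()
    {
        // Made-up confidence values for illustration only, not measurements.
        float[] samples = { 0.95f, 0.62f, 0.90f };
        foreach (float c in samples)
        {
            Console.WriteLine(c + " -> " + (IsCommand(c) ? "accept" : "ignore"));
        }
    }
}
```

In the test program, this check would go at the top of recognizer_SpeechRecognized, for example `if (!ConfidenceGate.IsCommand(e.Result.Confidence)) return;`. Whether any single cutoff separates "Octavian" from "Octopus" is exactly what the saved audio clips and printed confidences would have to show.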
Tags
<c#><speech-recognition><sapi><speech-to-text><noise>
Title
Issue distinguishing commands from normal speech with SAPI
singulars
PostAcceptedAnswerId
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
PostParentId
1. This table or related slice is empty.
PostTypePostTypeId
1. PTQuestion
UserLastEditorUserId
1. USOctavianus
UserOwnerUserId
1. USOctavianus
plurals
PostLinksPostIdRelatedPostId
1. PL
 singulars
 LinkTypeLinkTypeId
 LTLinked
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
2. PO
 singulars
 PostTypePostTypeId
 PTAnswer
3. PO
 singulars
 PostTypePostTypeId
 PTAnswer
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 POIssue distinguishing commands from normal speech with SAPI
 UserUserId
 USEric Smekens
 VoteTypeVoteTypeId
 VTFavorite
2. VO
 singulars
 PostPostId
 POIssue distinguishing commands from normal speech with SAPI
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
3. VO
 singulars
 PostPostId
 POIssue distinguishing commands from normal speech with SAPI
 UserUserId
 USXeon
 VoteTypeVoteTypeId
 VTFavorite
CommentsPostId
