Kinect speech recognition and skeleton tracking don't work together
I'm writing an application that can take several different external inputs (keyboard presses, motion gestures, speech) and produce similar outputs (for instance, pressing "T" on the keyboard will do the same thing as saying the word "Travel" out loud). Because of that, I don't want any of the input managers to know about each other. Specifically, I don't want the Kinect manager (as much as possible) to know about the Speech manager and vice versa, even though I'm using the Kinect's built-in microphone (the Speech manager should work with ANY microphone). I'm using System.Speech in the Speech manager as opposed to Microsoft.Speech.

I'm having a problem where, as soon as the Kinect motion recognition module is enabled, the speech module stops receiving input. I've tried a whole bunch of things, like [inverting the skeleton stream and audio stream](https://stackoverflow.com/questions/14046729/i-cant-get-kinect-sdk-to-do-speech-recognition-and-track-skeletal-data-at-the-s), capturing the audio stream in different ways, etc. I finally narrowed down the problem: something about how I'm initializing my modules does not play nicely with how my application deals with events.

The application works great until motion capture starts. If I completely exclude the Kinect module, this is how my main method looks:

```
// Main.cs
public static void Main()
{
    // Create input managers
    KeyboardMouseManager keymanager = new KeyboardMouseManager();
    SpeechManager speechmanager = new SpeechManager();

    // Start listening for keyboard input
    keymanager.start();

    // Start listening for speech input
    speechmanager.start();

    try
    {
        Application.Run();
    }
    catch (Exception ex)
    {
        MessageBox.Show(ex.StackTrace);
    }
}
```

I'm using `Application.Run()` because my GUI is handled by an outside program. This C# application's only job is to receive input events and run external scripts based on that input.

Both the keyboard and speech modules receive events sporadically. The Kinect, on the other hand, generates events constantly. If my gestures happened just as infrequently, a polling loop with a wait between each poll might be the answer. However, I'm using the Kinect to control mouse movement, so I can't afford to wait between skeleton event captures; the mouse would be very laggy, and my skeleton capture loop needs to run as continuously as possible. This presented a big problem, because now I can't have my Kinect manager on the same thread (or message pump? I'm a little hazy on the difference, which is why I think the problem lies here): from the way I understand it, being on the same thread would not allow keyboard or speech events to consistently get through. Instead, I kind of hacked together a solution where I made my Kinect manager inherit from `System.Windows.Forms.Form`, so that it would work with `Application.Run()`.
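In skeleton form, that hack looks something like this (trimmed down and slightly idealized; the window-hiding details in the constructor are a sketch and aren't important to the problem, but `getAudioSource()` is the method referenced by the speech code below):

```
// KinectManager.cs -- simplified sketch of the "inherit from Form" hack.
// Deriving from Form gives Application.Run() something to pump messages for.
using System.Windows.Forms;
using Microsoft.Kinect;

public class KinectManager : Form
{
    private KinectSensor kinect;
    private KinectAudioSource source;
    private bool connected;

    public KinectManager()
    {
        // The "window" is never meant to be seen; it exists only so that
        // Application.Run(kinectManager) can run a message loop for it.
        this.ShowInTaskbar = false;
        this.WindowState = FormWindowState.Minimized;
    }

    // Used by the Speech manager when the Kinect microphone is the input
    public KinectAudioSource getAudioSource()
    {
        return source;
    }

    // start() and the allFramesReady(...) handler are shown further down.
}
```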
Now, my main method looks like this:

```
// Main.cs
public static void Main()
{
    // Create input managers
    KeyboardMouseManager keymanager = new KeyboardMouseManager();
    KinectManager kinectManager = new KinectManager();
    SpeechManager speechmanager = new SpeechManager();

    // Start listening for keyboard input
    keymanager.start();

    // Attempt to launch the kinect sensor
    bool kinectLoaded = kinectManager.start();

    // Use the default microphone (if applicable) if kinect isn't hooked up
    // Use the kinect microphone array if the kinect is working
    if (kinectLoaded)
    {
        speechmanager.start(kinectManager);
    }
    else
    {
        speechmanager.start();
    }

    try
    {
        // THIS IS THE PLACE I THINK I'M DOING SOMETHING WRONG
        Application.Run(kinectManager);
    }
    catch (Exception ex)
    {
        MessageBox.Show(ex.StackTrace);
    }
}
```

For some reason, the Kinect microphone loses its "default-ness" as soon as the Kinect sensor is started (if this observation is incorrect, or there is a workaround, PLEASE let me know). Because of that, I had to make a special `start()` method in the Speech manager, which looks like this:

```
// SpeechManager.cs
/** For use with the Kinect Microphone **/
public void start(KinectManager kinect)
{
    // Get the speech recognizer information
    RecognizerInfo recogInfo = SpeechRecognitionEngine.InstalledRecognizers().FirstOrDefault();

    if (null == recogInfo)
    {
        Console.WriteLine("Error: No recognizer information found on Kinect");
        return;
    }

    SpeechRecognitionEngine recognizer = new SpeechRecognitionEngine(recogInfo.Id);

    // Loads all of the grammars into the recognizer engine
    loadSpeechBindings(recognizer);

    // Set speech event handler
    recognizer.SpeechRecognized += speechRecognized;

    using (var s = kinect.getAudioSource().Start())
    {
        // Set the input to the Kinect audio stream
        recognizer.SetInputToAudioStream(s,
            new SpeechAudioFormatInfo(EncodingFormat.Pcm, 16000, 16, 1, 32000, 2, null));

        // Recognize asynchronous speech events
        recognizer.RecognizeAsync(RecognizeMode.Multiple);
    }
}
```

For reference, the `start()` method in the Kinect manager looks like this:

```
// KinectManager.cs
public bool start()
{
    // Code from Microsoft Sample
    kinect = (from sensorToCheck in KinectSensor.KinectSensors
              where sensorToCheck.Status == KinectStatus.Connected
              select sensorToCheck).FirstOrDefault();

    // Fail elegantly if no kinect is detected
    if (kinect == null)
    {
        connected = false;
        Console.WriteLine("Couldn't find a Kinect");
        return false;
    }

    // Start listening
    kinect.Start();

    // Enable listening for all skeletons
    kinect.SkeletonStream.Enable();

    // Obtain the KinectAudioSource to do audio capture
    source = kinect.AudioSource;
    source.EchoCancellationMode = EchoCancellationMode.None;   // No AEC for this sample
    source.AutomaticGainControlEnabled = false;                // Important to turn this off for speech recognition

    kinect.AllFramesReady += new EventHandler<AllFramesReadyEventArgs>(allFramesReady);

    connected = true;
    return true;
}
```
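The `allFramesReady` handler mostly just does the mouse math, so I've left it out of the listings above; in spirit it is something like the following (heavily trimmed, with a simplified coordinate mapping, so treat it as illustrative rather than my exact code):

```
// KinectManager.cs -- illustrative sketch of the skeleton handler.
// (Uses System.Linq, System.Drawing and System.Windows.Forms.)
private void allFramesReady(object sender, AllFramesReadyEventArgs e)
{
    using (SkeletonFrame frame = e.OpenSkeletonFrame())
    {
        if (frame == null)
        {
            return;
        }

        Skeleton[] skeletons = new Skeleton[frame.SkeletonArrayLength];
        frame.CopySkeletonDataTo(skeletons);

        // Drive the cursor with the first tracked skeleton's right hand
        Skeleton tracked = skeletons.FirstOrDefault(
            s => s.TrackingState == SkeletonTrackingState.Tracked);
        if (tracked == null)
        {
            return;
        }

        SkeletonPoint hand = tracked.Joints[JointType.HandRight].Position;

        // Map the roughly -1..1 skeleton coordinates onto the screen
        Rectangle screen = Screen.PrimaryScreen.Bounds;
        int x = (int)((hand.X + 1) / 2 * screen.Width);
        int y = (int)((1 - hand.Y) / 2 * screen.Height);
        Cursor.Position = new Point(x, y);
    }
}
```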
So when I disable motion capture (by having my `main()` look similar to the first code segment), speech recognition works fine. When I enable motion capture, motion works great but no speech gets recognized. In both cases, keyboard events always work. There are no errors, and through tracing I found out that all the data in the speech manager is initialized correctly... it *seems* like the speech recognition events just disappear. How can I reorganize this code so that the input modules can work independently? Do I use threading, or just `Application.Run()` in a different way?
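To make the second half of that question concrete, the threading variant I have in mind would be something like this (completely untested, and I don't know if it's even the right direction, which is really what I'm asking):

```
// Untested sketch of the "threading" alternative (needs System.Threading).
// The idea: give the Kinect form its own message pump on a background thread
// so its constant stream of skeleton events can't starve keyboard/speech events.

// ... inside Main(), replacing the single Application.Run(kinectManager) call ...
Thread kinectThread = new Thread(() => Application.Run(kinectManager));
kinectThread.SetApartmentState(ApartmentState.STA); // WinForms pumps generally want STA
kinectThread.IsBackground = true;
kinectThread.Start();

// The main thread keeps its own form-less message pump for keyboard and speech.
Application.Run();
```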