Saving audio input of Android Stock speech recognition engine
<p>I am trying to save to a file the audio data captured by Android's speech recognition service.</p>
<p>Currently I implement <code>RecognitionListener</code> as explained here: <a href="https://stackoverflow.com/questions/5913773/speech-to-text-on-android">Speech to Text on Android</a>,</p>
<p>save the data into a buffer as illustrated here: <a href="https://stackoverflow.com/questions/5925657/capturing-audio-sent-to-googles-speech-recognition-server">Capturing audio sent to Google&#39;s speech recognition server</a>,</p>
<p>and write the buffer to a WAV file as described here: <a href="https://stackoverflow.com/questions/7336570/android-record-raw-bytes-into-wave-file-for-http-streaming">Android Record raw bytes into WAVE file for Http Streaming</a>.</p>
<p>My problem is how to find the appropriate audio settings to write into the WAV file's header. When I play the WAV file, I hear only strange noise with these parameters:</p>
<pre><code>short nChannels=2;// audio channels
int sRate=44100; // Sample rate
short bSamples = 16;// bits per sample
</code></pre>
<p>or nothing at all with these:</p>
<pre><code>short nChannels=1;// audio channels
int sRate=8000; // Sample rate
short bSamples = 16;// bits per sample
</code></pre>
<p>What confuses me is that, looking at the parameters of the speech recognition task in logcat, I first find <strong>Set PLAYBACK sample rate to 44100 HZ</strong>:</p>
<pre><code>12-20 14:41:34.007: DEBUG/AudioHardwareALSA(2364): Set PLAYBACK PCM format to S16_LE (Signed 16 bit Little Endian)
12-20 14:41:34.007: DEBUG/AudioHardwareALSA(2364): Using 2 channels for PLAYBACK.
12-20 14:41:34.007: DEBUG/AudioHardwareALSA(2364): Set PLAYBACK sample rate to 44100 HZ
12-20 14:41:34.007: DEBUG/AudioHardwareALSA(2364): Buffer size: 2048
12-20 14:41:34.007: DEBUG/AudioHardwareALSA(2364): Latency: 46439
</code></pre>
<p>and then <strong>aInfo.SampleRate = 8000</strong> when it plays the file to send to the Google server:</p>
<pre><code>12-20 14:41:36.152: DEBUG/(2364): PV_Wav_Parser::InitWavParser
12-20 14:41:36.152: DEBUG/(2364): File open Succes
12-20 14:41:36.152: DEBUG/(2364): File SEEK End Succes
...
12-20 14:41:36.152: DEBUG/(2364): PV_Wav_Parser::ReadData
12-20 14:41:36.152: DEBUG/(2364): Data Read buff = RIFF?
12-20 14:41:36.152: DEBUG/(2364): Data Read = RIFF?
12-20 14:41:36.152: DEBUG/(2364): PV_Wav_Parser::ReadData
12-20 14:41:36.152: DEBUG/(2364): Data Read buff = fmt
...
12-20 14:41:36.152: DEBUG/(2364): PVWAVPARSER_OK
12-20 14:41:36.156: DEBUG/(2364): aInfo.AudioFormat = 1
12-20 14:41:36.156: DEBUG/(2364): aInfo.NumChannels = 1
12-20 14:41:36.156: DEBUG/(2364): aInfo.SampleRate = 8000
12-20 14:41:36.156: DEBUG/(2364): aInfo.ByteRate = 16000
12-20 14:41:36.156: DEBUG/(2364): aInfo.BlockAlign = 2
12-20 14:41:36.156: DEBUG/(2364): aInfo.BitsPerSample = 16
12-20 14:41:36.156: DEBUG/(2364): aInfo.BytesPerSample = 2
12-20 14:41:36.156: DEBUG/(2364): aInfo.NumSamples = 2258
</code></pre>
<p>So, how can I find out the right parameters to save the audio buffer as a valid WAV file?</p>
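For reference, the header fields in the second log excerpt are related by simple arithmetic: BlockAlign = NumChannels × BitsPerSample / 8 and ByteRate = SampleRate × BlockAlign. The sketch below builds a standard 44-byte PCM WAV header from those values. It assumes the captured buffer holds raw 16-bit little-endian PCM, which the question does not confirm. The class name `WavHeader` and the method `build` are hypothetical and are not part of any Android API.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class WavHeader {
    /**
     * Builds a canonical 44-byte RIFF/WAVE header for uncompressed PCM.
     * dataLength is the size of the raw audio payload in bytes.
     */
    public static byte[] build(int channels, int sampleRate, int bitsPerSample, int dataLength) {
        int blockAlign = channels * bitsPerSample / 8; // bytes per sample frame
        int byteRate = sampleRate * blockAlign;        // bytes per second

        // All multi-byte WAV header fields are little-endian.
        ByteBuffer b = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
        b.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        b.putInt(36 + dataLength);                     // file size minus the first 8 bytes
        b.put("WAVE".getBytes(StandardCharsets.US_ASCII));
        b.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        b.putInt(16);                                  // size of the PCM fmt chunk
        b.putShort((short) 1);                         // AudioFormat 1 = PCM
        b.putShort((short) channels);
        b.putInt(sampleRate);
        b.putInt(byteRate);
        b.putShort((short) blockAlign);
        b.putShort((short) bitsPerSample);
        b.put("data".getBytes(StandardCharsets.US_ASCII));
        b.putInt(dataLength);
        return b.array();
    }
}
```

With the values from the log (1 channel, 8000 Hz, 16 bits), `WavHeader.build(1, 8000, 16, dataLength)` yields ByteRate 16000 and BlockAlign 2, matching the `aInfo` lines. Whether the bytes delivered to the listener actually use that format is a separate question that this sketch does not answer.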