StackOverflow2013

plurals

POJava algorithm for normalizing audio
primarykey
Id
12469361
data
AcceptedAnswerId
12473877
AnswerCount
2
ClosedDate
CommentCount
1
CommunityOwnedDate
CreationDate
2012-09-18T01:53:26.340
FavoriteCount
7
LastActivityDate
2012-09-19T02:08:55.167
LastEditDate
2012-09-18T08:27:53.733
LastEditorUserId
59015
OwnerUserId
59015
ParentId
0
PostTypeId
1
Score
12
ViewCount
6300
LastEditorDisplayName
text
Body
<p>I'm trying to normalize an audio file of speech.</p>
<p>Specifically, where an audio file contains peaks in volume, I'm trying to level it out, so the quiet sections are louder, and the peaks are quieter.</p>
<p>I know very little about audio manipulation, beyond what I've learnt from working on this task. Also, my math is embarrassingly weak.</p>
<p>I've done some research, and the Xuggle site provides a sample which shows reducing the volume using the following code: (<a href="http://build.xuggle.com/view/Stable/job/xuggler_jdk5_stable/ws/workingcopy/src/com/xuggle/mediatool/demos/ModifyAudioAndVideo.java">full version here</a>)</p>
<pre><code>@Override
public void onAudioSamples(IAudioSamplesEvent event)
{
    // get the raw audio bytes and adjust their value
    ShortBuffer buffer = event.getAudioSamples().getByteBuffer().asShortBuffer();
    for (int i = 0; i < buffer.limit(); ++i)
        buffer.put(i, (short)(buffer.get(i) * mVolume));
    super.onAudioSamples(event);
}
</code></pre>
<p>Here, they multiply the samples from <code>getAudioSamples()</code> by a constant, <code>mVolume</code>.</p>
<p>Building on this approach, I've attempted a normalisation that modifies the samples in <code>getAudioSamples()</code> to a normalised value, considering the max/min in the file. (See below for details.) I have a simple filter to leave "silence" alone (i.e., anything below a threshold value).</p>
<p>I'm finding that the output file is <strong><em>very</em></strong> noisy (i.e., the quality is seriously degraded). I assume that the error is either in my normalisation algorithm, or in the way I manipulate the bytes. However, I'm unsure of where to go next.</p>
<p>Here's an abridged version of what I'm currently doing.</p>
<h3>Step 1: Find peaks in file:</h3>
<p>Reads the full audio file, and finds the highest and lowest values of <code>buffer.get()</code> across all audio samples.</p>
<pre><code>@Override
public void onAudioSamples(IAudioSamplesEvent event)
{
    IAudioSamples audioSamples = event.getAudioSamples();
    ShortBuffer buffer = audioSamples.getByteBuffer().asShortBuffer();

    short min = Short.MAX_VALUE;
    short max = Short.MIN_VALUE;
    for (int i = 0; i < buffer.limit(); ++i)
    {
        short value = buffer.get(i);
        min = (short) Math.min(min, value);
        max = (short) Math.max(max, value);
    }
    // assignment of min/max omitted for brevity.
    super.onAudioSamples(event);
}
</code></pre>
<h3>Step 2: Normalize all values:</h3>
<p>In a loop similar to step 1, replace the buffer contents with normalized values, calling:</p>
<pre><code>buffer.put(i, normalize(buffer.get(i)));

public short normalize(short value)
{
    if (isBackgroundNoise(value))
        return value;

    short rawMin = // min from step1
    short rawMax = // max from step1
    short targetRangeMin = 1000;
    short targetRangeMax = 8000;

    int abs = Math.abs(value);
    double a = (abs - rawMin) * (targetRangeMax - targetRangeMin);
    double b = (rawMax - rawMin);
    double result = targetRangeMin + ( a/b );

    // Copy the sign of value to result.
    result = Math.copySign(result, value);
    return (short) result;
}
</code></pre>
<h2>Questions:</h2>
<ul>
<li>Is this a valid approach for attempting to normalize an audio file?</li>
<li>Is my math in <code>normalize()</code> valid?</li>
<li>Why would this cause the file to become noisy, where a similar approach in the demo code doesn't?</li>
</ul>
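For reference, plain peak normalization applies one constant gain to every sample, much like `mVolume` in the Xuggle demo, rather than remapping each sample into a target range. The following is a minimal sketch only, assuming 16-bit signed PCM already held in a `short[]`. The class and method names (`PeakNormalizer`, `peak`, `normalize`) and the `targetPeak` parameter are hypothetical and not part of Xuggle.

```java
public final class PeakNormalizer {

    private PeakNormalizer() {}

    /** Returns the largest absolute sample value (0..32768). */
    public static int peak(short[] samples) {
        int peak = 0;
        for (short s : samples) {
            // Widen to int first so that Math.abs(Short.MIN_VALUE) does not overflow.
            peak = Math.max(peak, Math.abs((int) s));
        }
        return peak;
    }

    /**
     * Scales every sample by the same gain so that the loudest sample
     * ends up at targetPeak. Returns a new array; the input is not modified.
     */
    public static short[] normalize(short[] samples, int targetPeak) {
        short[] out = new short[samples.length];
        int peak = peak(samples);
        if (peak == 0) {
            return out; // pure silence: nothing to scale
        }
        double gain = targetPeak / (double) peak;
        for (int i = 0; i < samples.length; i++) {
            long scaled = Math.round(samples[i] * gain);
            // Clamp to the 16-bit range to avoid wrap-around on overflow.
            scaled = Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, scaled));
            out[i] = (short) scaled;
        }
        return out;
    }
}
```

Because every sample is multiplied by the same factor, the shape of the waveform is preserved and a sample of 0 stays 0. This changes overall loudness only; it does not narrow the gap between quiet and loud passages.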
Tags
<java><math><audio>
Title
Java algorithm for normalizing audio
singulars
PostAcceptedAnswerId
1. PO
  singulars
  PostTypePostTypeId
  PTAnswer
PostParentId
1. This table or related slice is empty.
PostTypePostTypeId
1. PTQuestion
UserLastEditorUserId
1. USMarty Pitt
UserOwnerUserId
1. USMarty Pitt
plurals
PostLinksPostIdRelatedPostId
1. PL
  singulars
  LinkTypeLinkTypeId
  LTLinked
PostLinksRelatedPostIdPostId
1. PL
  singulars
  LinkTypeLinkTypeId
  LTLinked
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. PO
  singulars
  PostTypePostTypeId
  PTAnswer
2. PO
  singulars
  PostTypePostTypeId
  PTAnswer
VotesPostIdCreationDate
1. VO
  singulars
  PostPostId
  POJava algorithm for normalizing audio
  UserUserId
  This table or related slice is empty.
  VoteTypeVoteTypeId
  VTUpMod
2. VO
  singulars
  PostPostId
  POJava algorithm for normalizing audio
  UserUserId
  This table or related slice is empty.
  VoteTypeVoteTypeId
  VTUpMod
3. VO
  singulars
  PostPostId
  POJava algorithm for normalizing audio
  UserUserId
  This table or related slice is empty.
  VoteTypeVoteTypeId
  VTUpMod
CommentsPostId
1. CO"I'm trying to level it out, so the quiet sections are louder, and the peaks are quieter." - That's called [dynamic range compression](http://en.wikipedia.org/wiki/Dynamic_range_compression), which is not the same as normalization.
  singulars
  PostPostId
  POJava algorithm for normalizing audio
  UserUserId
  USJesper
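To illustrate the distinction the comment draws, here is a rough sketch of a dynamic range compressor, as opposed to a normalizer. It is deliberately simplified and rests on several assumptions: 16-bit PCM in a `short[]`, a one-pole envelope follower, and a ratio applied to linear amplitude rather than in decibels as production compressors usually do. All names (`SimpleCompressor`, `compress`, `threshold`, `ratio`, `attack`, `release`) are hypothetical.

```java
public final class SimpleCompressor {

    private SimpleCompressor() {}

    /**
     * Reduces the gain of passages whose smoothed level exceeds threshold.
     *
     * @param threshold level (in sample units) above which gain is reduced
     * @param ratio     how strongly the excess above threshold is reduced (>= 1)
     * @param attack    smoothing coefficient in [0, 1) used while the level rises
     * @param release   smoothing coefficient in [0, 1) used while the level falls
     */
    public static short[] compress(short[] samples, double threshold,
                                   double ratio, double attack, double release) {
        short[] out = new short[samples.length];
        double envelope = 0.0;
        for (int i = 0; i < samples.length; i++) {
            double level = Math.abs((int) samples[i]);
            // Follow the signal level: one coefficient when rising, another when falling.
            double coeff = level > envelope ? attack : release;
            envelope = coeff * envelope + (1.0 - coeff) * level;

            // Gain is 1 below the threshold; above it, the excess is divided by ratio.
            double gain = 1.0;
            if (envelope > threshold) {
                gain = (threshold + (envelope - threshold) / ratio) / envelope;
            }
            long scaled = Math.round(samples[i] * gain);
            scaled = Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, scaled));
            out[i] = (short) scaled;
        }
        return out;
    }
}
```

The gain here varies over time with the smoothed level instead of being fixed for the whole file. A compressor on its own only makes loud passages quieter; to make the quiet sections louder as well, a constant make-up gain is typically applied afterwards.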
