Mastering an audiobook

Thread starter: davidzweig

I'm doing some work for a friend who is producing an audiobook. They did the recording with a Zoom H2 portable recorder and a pop filter, in a wood-lined sauna. The sauna seems to keep the echo down.

I have some experience working with Adobe Audition, mostly from restoring old tape recordings. I will explain what I've done to try to master this vocal recording, but it's my first one, and there's probably something that can be done better. I would appreciate any feedback. Thanks. Ok:

1. Raw WAV file is run through iZotope RX2 to remove noise. It removes 25 dB of noise (hiss), using algorithm D with an adaptive learning time of 5 seconds. Works well. Output is in the archive below (denoised.wav).

2. Import into Adobe Audition 4.0. Apply parametric EQ. Settings are here:
www!omilia!org/hosted_files/eq.jpg (change '!' to '.')
Cutting at 517 Hz is an attempt to reduce the 'boxy' resonance.

3. Normalise to -0.1 dB

4. Hard limit to -0.1 dB whilst increasing gain by 7 dB:
www!omilia!org/hosted_files/hl7.jpg

5. Run the de-esser. Here is a frequency analysis of a 'ssss' sound by the reader:
www!omilia!org/hosted_files/freqa.jpg
I set centre frequency: 10500 Hz, bandwidth: 6000 Hz, threshold: -18 dB, multiband. I don't know if this is set up right or not, or if it's needed.

6. Export as 96 kbps CBR, 44.1 kHz, mono MP3 (included in the archive as 02.mp3).
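For readers following along, here is a rough sketch of what steps 3 and 4 do to the samples, in plain Python. It is an illustration only: the function names are made up for this example, the signal is a toy list of floats in the range -1.0 to 1.0, and the "limiter" is a simple clamp, whereas a real limiter typically smooths its gain changes.

```python
def db_to_gain(db):
    """Convert a level change in decibels to a linear amplitude factor."""
    return 10 ** (db / 20)

def normalise(samples, target_db=-0.1):
    """Scale samples so the largest absolute value sits at target_db dBFS."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # silence: nothing to scale
    scale = db_to_gain(target_db) / peak
    return [s * scale for s in samples]

def boost_and_hard_limit(samples, boost_db=7.0, ceiling_db=-0.1):
    """Apply gain, then clamp anything beyond the ceiling (simplified limiter)."""
    gain = db_to_gain(boost_db)
    ceiling = db_to_gain(ceiling_db)
    return [max(-ceiling, min(ceiling, s * gain)) for s in samples]

# Toy signal: a fairly steady level with one spike (hypothetical values).
raw = [0.10, -0.12, 0.11, 0.50, -0.09, 0.10]
out = boost_and_hard_limit(normalise(raw))
print([round(s, 3) for s in out])
```

After normalising, only the spike sits near full scale, so a 7 dB boost pushes the spike into the ceiling while the rest of the signal simply gets louder. Anything within 7 dB of the original peak is flattened in the same way.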

All the files are available here:
www!omilia!org/hosted_files/editing.zip

I would appreciate it if someone could take a look at these and let me know what could be done better.

Thanks a lot.

David
 
Without listening (not in a position to at the moment)... Just a couple of thoughts off the top o' me head...

1) The only room I can think of worse than a tile bathroom to record a human voice in would be a small, wood-lined space like a closet or a sauna. I can't even imagine the resonance, but I'd bet it's quite impressive.

2) Twenty-five dB of noise reduction seems about 22dB more excessive than it should be. Unless it's removing the resonance in the wood reflection chamber, which I suppose could easily seem like it'd be 25dB over the source.

3) Normalizing...?

4) Limiting after normalizing??

5) Doing anything at all (de-essing, EQ'ing, whatever) to a source that's been normalized and hard-limited?!?

6) 96kbps? The "mono" part I understand. But 160kbps (which still falls into the "pretty crappy" factor, but should be fine for a mono, voice-only recording assuming the converter isn't injecting a lot of "space-monkeys" into the top end) seems like a pretty reasonable place to be if you're actually interested in any semblance of sound quality...

Let's put it this way -- And again, it's only an assumption (but an assumption from 30-ish years of doing this sort of stuff). Even if the recording sounds half-decent, it could probably be far and away better simply by doing things a little differently.
 
Thanks for the reply. I appreciate any help. In response:

1) I've attempted voice recordings before in various places, with a few different Samson USB condenser mics, sometimes covering the walls in duvets. Never in a studio. My results have varied. But this recording doesn't seem to suffer from a bad environment.

2) iZotope does a really good job on the noise and doesn't compromise the original signal much. It perhaps has a different 'scale' for reducing noise; it sounds similar to removing about 10-15 dB in Audition, but with fewer artefacts.

3/4) Normalising and limiting, yes: to chop 7 dB off the loudest volume spikes and increase the gain to maximum without clipping. I don't see the problem? I had experimented with using a compressor, but it didn't really seem necessary, as the level is fairly constant anyway, apart from the occasional spike.

5) A de-esser reduces the gain, and how much it reduces depends on the level of the signal going in (the threshold). So I wanted more or less the same level going in across all the tracks, so that I could use the same de-esser settings. That is why I did this part after normalising and limiting. I don't really see a problem with being close to clipping if the amplitude is only being reduced?

6) To be honest, with a clean voice recording and no music, I can't hear much benefit in going beyond 56 kbps CBR MP3. If there's noise or music, it's a different story. This is from listening carefully with a cheap pair of Samson monitor headphones. With crappy earbuds in a noisy environment (the anticipated use), I imagine the benefit of higher bitrates is further diminished. So 96 kbps seems very generous to me.
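The point in 5) about a de-esser depending on the input level can be sketched with a basic threshold-and-ratio gain curve. This is a generic compressor-style model, not Audition's actual de-esser; the -18 dB threshold comes from the settings above, while the 4:1 ratio and the function name are assumptions made for the example.

```python
def gain_reduction_db(band_level_db, threshold_db=-18.0, ratio=4.0):
    """Gain change in dB (zero or negative) for a given sibilance-band level.

    Generic compressor curve: levels above the threshold are scaled
    back so that the overshoot is divided by `ratio` (assumed value).
    """
    over = band_level_db - threshold_db
    if over <= 0:
        return 0.0  # at or below threshold: the band passes untouched
    return -(over - over / ratio)

# The same 'sss' sound arriving at different levels gets treated differently.
for level in (-24.0, -18.0, -12.0, -6.0):
    print(level, "dB in ->", gain_reduction_db(level), "dB of gain change")
```

With a fixed threshold, a quiet track may never trigger the de-esser while a loud one is cut heavily, which is the reason given above for levelling the tracks first.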

I would appreciate it if someone could listen to the de-noised WAV and tell me whether they think the de-esser is needed or not.
 
I'm largely with Massive Master here. If you need 25dB of noise reduction, there is something fundamentally wrong with your recordings. Noise reduction is a remedial process for emergencies, not a fundamental step--and 25dB is a HUGE amount of noise reduction. What's causing the noise? Is it some form of room/ventilation noise, or is it electronic noise indicating a fault with your mic or cable?

Whichever, the first thing I'd do is identify where the noise is coming from and eliminate it at source rather than after the recording.

Second, I'd do any EQ, de-essing or other processing you deem necessary and make the normalise/hard limit the very last step. I'm not quite as down as Massive Master on the normalise/hard limit sequence because I've used it myself on voice. I normalise to near the 0dB level then visually look at the difference between the spikes and the "average". I then set the limiter to cut the most obvious big peaks but not take the life out of things.
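The "difference between the spikes and the average" can also be put into numbers as peak level minus RMS level (the crest factor). The sketch below is a minimal illustration in plain Python; the function names are invented for the example and the test signal is a synthetic sine with one artificial spike, not the actual recording.

```python
import math

def peak_dbfs(samples):
    """Largest absolute sample value, in dB relative to full scale (1.0)."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")

def rms_dbfs(samples):
    """Root-mean-square level in dB relative to full scale (1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")

def crest_factor_db(samples):
    """How far the peaks stick out above the average level, in dB."""
    return peak_dbfs(samples) - rms_dbfs(samples)

# Synthetic stand-in for speech: a steady 1 kHz tone plus a single spike.
rate = 44100
signal = [0.2 * math.sin(2 * math.pi * 1000 * n / rate) for n in range(rate)]
signal[100] = 0.9  # hypothetical plosive or click
print("crest factor:", round(crest_factor_db(signal), 1), "dB")
```

A large crest factor caused by a handful of spikes suggests a limiter can trim them without touching the body of the voice; a large crest factor spread across the whole file suggests limiting will audibly flatten it.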

I downloaded your files and, viewing them in Audition 3.01, your "final" 02 file is seriously over-limited and, in the conversion to low bit rate MP3, has started clipping on almost every peak--which, with the limiting, means constantly. On my studio monitors, the "de-noised" and the "02" files are like chalk and cheese--with the "final" version being by far the worse. I also tried them on "cheap" headphones (probably more typical of the audience) and even there, the "de-noised" sounded okay but the "02" is so tinny and clipped as to be very fatiguing to listen to, even over your short clip.

Is there a technical reason (limited storage maybe) for wanting the lower bit rate? If not, I'd seriously rethink here. Other audio books I listen to don't seem nearly as bit rate limited as your efforts.
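On the storage question, the arithmetic for constant bit rate files is simple enough to sketch. The helper name below is made up for this example, and it ignores tags and container overhead, so treat the figures as approximate.

```python
def mp3_size_mb(bitrate_kbps, duration_minutes):
    """Approximate size of a CBR MP3 in megabytes (10**6 bytes)."""
    bits = bitrate_kbps * 1000 * duration_minutes * 60
    return bits / 8 / 1_000_000

# Bit rates mentioned in this thread, for one hour of audio.
for kbps in (56, 96, 128, 160):
    print(kbps, "kbps:", round(mp3_size_mb(kbps, 60), 1), "MB per hour")
```

Going from 96 kbps to 128 kbps costs roughly a third more space per hour, which is only a concern if the delivery medium is tightly constrained.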

FYI, I took your "de-noised" file, normalised to -0.3dB, hard limited to the same level with a 6dB boost, then saved as a mono 128kbps CBR MP3 file and it sounded hugely better. I didn't play with the de-esser as it didn't seem to need it--I think the harsh and tinny quality is more down to the bit rate you're using and the high levels/clipping.

Anyway, my two cents worth.

Bob
 
Could you post the original audio, before you altered it?

I can't imagine all the stuff that you're doing is necessary.
 
Thanks bob.

The original WAV from the recorder is here:

www!omilia!org/hosted_files/raw.wav

The noise floor does seem higher than it perhaps should be for a device like this. There are sample recordings of the noise floor of the device here:

www!wingfieldaudio!com/portable-recorder-noise.html
(hopefully I can post URLs by the next post..)

In any case, the recordings are finished, and I should do with them what I can.

Sorry, I think the MP3 I produced had an error. I made a script to apply all these steps, and it seems to have limited the audio twice, which is why it was clipping.
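To illustrate why a doubled limiting step does so much damage, here is a small sketch in plain Python. It reuses the 7 dB boost and -0.1 dB ceiling from the original steps, models the limiter as a simple clamp (a simplification), and runs on a toy ramp of levels rather than real audio. The function names are invented for the example.

```python
def boost_and_limit(samples, boost_db=7.0, ceiling_db=-0.1):
    """Apply gain, then clamp to the ceiling (simplified hard limiter)."""
    gain = 10 ** (boost_db / 20)
    ceiling = 10 ** (ceiling_db / 20)
    return [max(-ceiling, min(ceiling, s * gain)) for s in samples]

def fraction_at_ceiling(samples, ceiling_db=-0.1):
    """Share of samples that are pinned at the limiter ceiling."""
    ceiling = 10 ** (ceiling_db / 20)
    hits = sum(1 for s in samples if abs(s) >= ceiling - 1e-12)
    return hits / len(samples)

# Toy input: 100 levels ramping evenly from 0.01 up to full scale.
signal = [n / 100 for n in range(1, 101)]
once = boost_and_limit(signal)
twice = boost_and_limit(once)  # what a script bug would do
print("pinned after one pass: ", fraction_at_ceiling(once))
print("pinned after two passes:", fraction_at_ceiling(twice))
```

Running the step twice is equivalent to a 14 dB boost into the same ceiling, so a much larger share of the material ends up flattened against it.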

I can use 128 kbps. I will lose the de-esser. Any tips for the EQ?
 
The gain seems to be ok on the recording device, peaking at -5 dB. Listening to the samples on that Wingfield Audio page, it sounds like she was using an external dynamic mic instead of the built-in mics, which probably accounts for the noise.
 
Thanks for posting the original.

First off, I have to say I misread your original posts and thought you'd recorded the material on a Samson mic. My bad. I understand now that you're working with somebody else's mess!

Okay. There's a big fault with the original. The noise floor is at about -40dB which is about 45-50dB worse than it should be with a Zoom recorder. Even worse, there's a ripple on the noise. I've used Zoom recorders before and they tend to be much better so I think there's a fault with the recorder--if it's still under warranty, I think your friend should take it back.
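A noise floor figure like this can be estimated by measuring the RMS level of a passage with no speech in it. The sketch below is a minimal plain-Python illustration: the function names are invented, the "pause" is synthetic Gaussian hiss rather than the real file, and the -85 dBFS reference is just the upper end of what "45-50dB worse than -40dB" implies.

```python
import math
import random

def rms_dbfs(samples):
    """Root-mean-square level in dB relative to full scale (1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")

def noise_floor_dbfs(samples, rate, start_s, end_s):
    """RMS level of a passage the caller believes contains no speech."""
    segment = samples[int(start_s * rate):int(end_s * rate)]
    if not segment:
        raise ValueError("empty segment")
    return rms_dbfs(segment)

# Synthetic stand-in for a pause between sentences: hiss with a standard
# deviation of 0.01, which works out to roughly -40 dBFS RMS.
random.seed(0)
rate = 44100
hiss = [random.gauss(0.0, 0.01) for _ in range(rate)]
measured = noise_floor_dbfs(hiss, rate, 0.0, 1.0)
print("noise floor:", round(measured, 1), "dBFS")
print("gap to a -85 dBFS floor:", round(measured + 85.0, 1), "dB")
```

On a real recording, the samples would come from the WAV file and the start and end times would be picked by ear or by eye from a silent stretch.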

That said, I think you did about as well with the NR as you can do.

Regarding EQ, I don't think it needs too much--I played with some roll off above about 8k which tames the slight sibilance a bit and may remove your need for a de-esser. As for the mids, there's a problem. I see what you mean about the slight "boxiness" but cutting in that range also affects the intelligibility which is exactly there. I had a play but, in the end, decided the best compromise was to leave it, especially on my "cheap" headphones as opposed to big studio monitors. YMMV.
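As a rough feel for what "some roll off above about 8k" might do, the sketch below evaluates an ideal first-order (6 dB per octave) low-pass with its corner at 8 kHz. This is an assumption for illustration: the post does not say which filter shape or slope was used, and a digital EQ at 44.1 kHz will deviate from this analog-style formula near the top of the band. The function name is invented.

```python
import math

def first_order_lowpass_db(freq_hz, corner_hz=8000.0):
    """Gain in dB of an ideal first-order low-pass at freq_hz."""
    return -10 * math.log10(1 + (freq_hz / corner_hz) ** 2)

# Frequencies of interest: the 'boxy' cut, mid speech range, the corner,
# the de-esser centre frequency, and an octave above the corner.
for f in (517, 4000, 8000, 10500, 16000):
    print(f, "Hz:", round(first_order_lowpass_db(f), 2), "dB")
```

Under these assumptions the sibilance region loses a few dB while the midrange that carries intelligibility is left almost untouched, which is consistent with the gentle approach described above.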

I might be tempted to try some subtle work in multiband compression rather than the kludge of hard limiting but at this stage family pressures have stopped my playing!

Glad to hear you found the problem with your script...the original "final" version was very bad! After that, anything will be better.

Bob
 
I'll let the reader know that there is possibly something wrong with her mic. This recording will be good enough for the intended purpose (an accompaniment to a book for learning French), but I'll try to improve things for next time. The NR occasionally makes the voice sound hollow/artificial and loses some of the natural 'airiness.'

Thanks a lot for your advice, it's really a great help.
 
Glad I was at least a slight help. I still may have a play tomorrow if I get a chance (it's already after 1AM Friday here in Aus) just as a bit of a challenge.

I had a hunch it was a "learn French" book. My own French is years out of date and I was never fluent but I could pick up most of it. That makes the intelligibility thing of key importance over absolute sound quality.

Bob
 
Can you share the audio recording? It's hard to advise when you're just left with descriptions.
 
And there's a link to a sample of the raw files in Post number 8.
 