Extracting Vocals

frederic

New member
Not sure if this is the correct section or not, but I will blast it out there anyway hoping that it is.

Due to a very long, tiring story I have a stereo MP3 where the vocals are clear (and centered), but the background music (somewhat stereo) is muddy, crummy, aggravating, terrible, and about 20 other negatives.

To save time, I'd like to extract the vocals as best I can while chopping off as much of the music as I can.

I own Sonar 8 Producer and have Audacity at my fingertips (with the VST add-on), and most, if not all, of my Sonar VST plug-in effects are available in Audacity.

I can easily eliminate the vocals in either program, though I used Audacity because it loads faster and is simpler to use, GUI-wise.

I just inverted "R" and played L and R at the same time, and I have a mono-ish vocal-free recording. It's a little thin, but I figured as a starting point that would be okay.
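For anyone following along, the invert-and-sum step described above can be sketched in a few lines of Python. This is only an illustration on a synthetic signal (the sine-wave "vocal" and "music" and the pan gains 0.9 and 0.3 are made up for the example), not what Audacity does internally.

```python
import math

def a_minus_b(left, right):
    """Invert the right channel and sum it with the left: L + (-R)."""
    return [l - r for l, r in zip(left, right)]

# Synthetic stand-in for the MP3 (an assumption, not the real file):
# a centered "vocal" plus "music" that is louder on the left than the right.
n = 1000
vocal = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
music = [math.sin(2 * math.pi * 13 * i / n) for i in range(n)]
left = [v + 0.9 * m for v, m in zip(vocal, music)]
right = [v + 0.3 * m for v, m in zip(vocal, music)]

# The vocal is identical in both channels, so it cancels.
# Only the part of the music that differs between L and R remains.
diff = a_minus_b(left, right)
```

The music survives only to the extent that it differs between the two channels, which is one reason the result sounds thin.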

I converted what remained to mono.

Then, I imported the same stereo mp3, converted it to mono, then inverted the first mono track (the one without the vocals) hoping to cancel out the music leaving the vocals.

While that seemed logical, the results were... awful. I made a massive aliasing mess. I also tried playing with the EQ on the vocal-zapped mono track before inverting it, and that made things worse. I tried EQing the combination of the inverted vocal-free track with the original track converted to mono and that seemed closer.

After monkeying with this for most of today, I figured I'd ask if any of you have a reasonably decent method of capturing just the vocals, or at least "mostly" vocals and dump the rest.

Obviously there is no magic bullet as I'm removing information from the digital audio representation of the song, but I would be happy with 'reasonable'.

Any ideas? I'm open to 'em !


I also played with several free VST plug-ins. They work about as well as my own approach of EQing the mess of inverted, mono-from-stereo, phase-shifted, vocal-zapped tracks, though it was more convenient to click a button than to do all these steps.

Thanks in advance. If this works out even "medium" okay, it saves me an 11-hour trip to Ohio and back.
 
When you say you've tried VST plugs, I assume you've tried mid/side processors like Voxengo's MSED? If not, I'd give that a shot first.
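For reference, the textbook mid/side math that processors of this kind are built around looks roughly like the sketch below. I can't say how MSED itself is implemented; the function names and the gain parameters here are hypothetical.

```python
def ms_encode(left, right):
    """Split stereo into mid (what L and R share) and side (how they differ)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side, mid_gain=1.0, side_gain=1.0):
    """Rebuild stereo from mid/side, optionally rebalancing the two."""
    left = [mid_gain * m + side_gain * s for m, s in zip(mid, side)]
    right = [mid_gain * m - side_gain * s for m, s in zip(mid, side)]
    return left, right

# Tiny made-up example: three samples per channel.
left = [1.0, 0.5, -0.25]
right = [0.0, 0.5, 0.75]
mid, side = ms_encode(left, right)

# Muting the side keeps only the center (both outputs equal the mid).
center_l, center_r = ms_decode(mid, side, side_gain=0.0)
# Muting the mid keeps only the stereo difference (vocal-reduced).
wide_l, wide_r = ms_decode(mid, side, mid_gain=0.0)
```

With both gains at 1.0 the decode step returns the original left and right channels, so the encode itself loses nothing.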

But as far as your manual process - if I follow you correctly - you might be getting that phasing because of the combination of all those inversions and mono conversions.

After you have removed the vocals, have you tried leaving the results in stereo, as thin as they may be? Then from there, invert what remains in the left channel so that both the L and R channels are inverted, and then try canceling that against the stereo original. They won't evenly cancel, of course; it may help to try and match levels between the two versions first, though that won't make it perfect either.

G.
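One rough way to do the level matching mentioned above is to scale one track so its RMS level equals the other's. The sketch below is only that, a sketch: the helper names are hypothetical, and matching overall RMS does not guarantee that the shared material lines up in level.

```python
import math

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def match_level(reference, track):
    """Scale `track` so its RMS equals that of `reference`."""
    track_rms = rms(track)
    # Leave a silent track alone rather than dividing by zero.
    gain = rms(reference) / track_rms if track_rms > 0 else 1.0
    return [gain * s for s in track], gain

# Made-up example: `track` is the same material at half the level.
reference = [0.2, -0.4, 0.6, -0.8]
track = [0.1, -0.2, 0.3, -0.4]
matched, gain = match_level(reference, track)
```

In this toy case the gain works out to about 2, since the second track is simply the first at half level.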
 
When you say you've tried VST plugs, I assume you've tried mid/side processors like Voxengo's MSED? If not, I'd give that a shot first.

Yes, I tried that one, as I have all the Voxengo plug-ins running with Sonar and Audacity. I also tried "Karakoa", which is an older VST plug-in, but it actually did a slightly better job than I could get MSED to do.


But as far as your manual process - if I follow you correctly - you might be getting that phasing because of the combination of all those inversions and mono conversions.

That's what it sounds like to my ears... and before mixing down to a stereo or mono mix (I tried both ways) I nudged the tracks forward and back by several small amounts hoping to line things up. In Audacity, however, they already seem to be lined up.
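Instead of nudging by ear, the offset between two copies of the same audio can be estimated by brute-force cross-correlation. This is a minimal sketch on made-up data; `best_lag` is a hypothetical helper, and the search is slow on full-length tracks, so it would normally be run on a short excerpt.

```python
import random

def best_lag(ref, other, max_lag):
    """Return how many samples `other` lags behind `ref` (negative = ahead)."""
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(len(ref)):
            j = i + lag
            if 0 <= j < len(other):
                score += ref[i] * other[j]
        # Keep the shift where the two signals agree most strongly.
        if score > best_score:
            best, best_score = lag, score
    return best

# Made-up test signal: noise, and a copy of it delayed by 3 samples.
rng = random.Random(0)
ref = [rng.uniform(-1.0, 1.0) for _ in range(200)]
delayed = [0.0] * 3 + ref[:-3]
lag = best_lag(ref, delayed, max_lag=10)
```

A result of 0 means the two tracks are already sample-aligned, in which case nudging can only make the cancellation worse.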

After you have removed the vocals, have you tried leaving the results in stereo, as thin as they may be? Then from there, invert what remains in the left channel so that both the L and R channels are inverted, and then try canceling that against the stereo original. They won't evenly cancel, of course; it may help to try and match levels between the two versions first, though that won't make it perfect either.

I'm not after perfection, just better than what I've done. The best way to solve this problem is to have her drive out here and stand in the booth with a mic in her face.

I'll give what you suggest a try... thanks sir!
 
The best way to solve this problem is to have her drive out here and stand in the booth with a mic in her face.
Amen to that! :)
I'll give what you suggest a try... thanks sir!
I make no promises, just kind of grabbing at straws there. The only driving thought is that reducing the number of steps may reduce the artifacting. Worth a shot at least.

Good luck! :)

G.
 
Amen to that! :)

It's just another reminder that doing favors for friends requires endless streams of punishment :)

I make no promises, just kind of grabbing at straws there. The only driving thought is that reducing the number of steps may reduce the artifacting. Worth a shot at least.

Thanks for the wish of luck... I'm really not a plug-in guy and kind of learning as I go. I miss the reels, knobs and faders, I really do.

I'll keep ya posted. I have four versions of her voice only and I can't decide which one sucks less, because they all suck lol.
 
I converted what remained to mono.

Then, I imported the same stereo mp3, converted it to mono

OK, so I am having difficulty understanding this. Why are you converting it to mono?

Leave the vocal-eliminated track stereo, then mix it in with the original, except reverse the phase on the vocal-eliminated track (while keeping it stereo, so on its own it should sound really wide). In theory, this should get rid of all the "side" signal, leaving you with the center, where the vocal is.

If this doesn't do it, try to reverse the L/R sides on the vocal-less track.
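Here is a small sketch of what that stereo cancellation works out to in the ideal case. It assumes the vocal-eliminated track is the side signal, (L - R) / 2, in the left channel and its inverse in the right; a real track may be scaled differently, and the test signal and pan gains are made up.

```python
import math

def cancel_side_against_original(left, right):
    """Subtract a wide, out-of-phase 'vocal-eliminated' pair from the original."""
    side = [(l - r) / 2 for l, r in zip(left, right)]
    wide_l = side                    # vocal-eliminated track, left channel
    wide_r = [-s for s in side]      # same thing inverted, right channel
    out_l = [l - w for l, w in zip(left, wide_l)]
    out_r = [r - w for r, w in zip(right, wide_r)]
    return out_l, out_r

# Synthetic stand-in: centered vocal, music louder on the left.
n = 1000
vocal = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
music = [math.sin(2 * math.pi * 13 * i / n) for i in range(n)]
left = [v + 0.9 * m for v, m in zip(vocal, music)]
right = [v + 0.3 * m for v, m in zip(vocal, music)]

out_l, out_r = cancel_side_against_original(left, right)
```

Both outputs come out equal to (L + R) / 2, i.e. the plain mono sum: the side signal is gone, but in this example 0.6 of the music is still sitting under the vocal because that much of it was in the center to begin with.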
 
BTW, stuff like this almost never sounds 100% satisfactory, and you may end up with some phasey artifacts. That said, I've used this technique on samples that I've lifted from movies and other sources where I just wanted the dialogue, for example. It still leaves some of the background noise there, as there is always something in the center other than what you want :)
 
Extracting Vocals = "A Minus B"

Removing vocals (which are generally placed in the center of the stereo field) is easy, by doing what you did: inverting one channel (say, left), and summing the inverted channel and original opposite channel to mono. The process is also called "A Minus B". "Vocal Eliminator", too.

You're done.

No amount of re-flipping, inverting, reversing, stereo-ing, or whatever will get you any real amount closer.

And even though it seems as though you could mix and re-invert and get another level of removal happening, it just doesn't work that way. *sigh*
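A short bit of algebra shows why. Write the vocal as V and the music in each channel as A and B (these symbols are mine, purely for illustration), and let g be any gain applied to the difference track before mixing it against the mono sum:

```latex
\begin{aligned}
L &= V + A, \qquad R = V + B \\
D &= L - R = A - B \\
M &= \tfrac{1}{2}(L + R) = V + \tfrac{1}{2}(A + B) \\
M - gD &= V + \left(\tfrac{1}{2} - g\right)A + \left(\tfrac{1}{2} + g\right)B
\end{aligned}
```

The two music coefficients always add up to 1, so they can never both be zero. Setting g = 1/2 just gives back R, and g = -1/2 gives back L. The one exception is when A and B are nothing more than scaled copies of each other (a single source panned to one spot), which real stereo backing music generally is not.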

EQ may help, especially since the remaining de-vocalized track is now probably bass-thin, as bass also tends to be mixed to the middle, and is also now reduced/removed. So a bass boost can help the track sound more complete.

There are a few plugins that will allow for a frequency-specific inversion, allowing for the differences in people's voices, so there would be a bit less destruction to the remaining musical information, but still only just a teensy bit better...
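As a rough idea of what a frequency-aware version can look like, the sketch below removes the center only above a cutoff by adding the low-passed mid back onto the difference signal, which also addresses the thin bass. It is not how any particular plugin works; the one-pole filter, the 150 Hz default, and the function names are all assumptions for illustration.

```python
import math

def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """Very gentle (6 dB/octave) low-pass filter."""
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    y, out = 0.0, []
    for s in samples:
        y += a * (s - y)
        out.append(y)
    return out

def remove_center_keep_bass(left, right, cutoff_hz=150.0, sample_rate=44100):
    """Cancel the center, then add back only the low end of the center."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    low_mid = one_pole_lowpass(mid, cutoff_hz, sample_rate)
    return [s + b for s, b in zip(side, low_mid)]
```

A centered 50 Hz bass note passes through almost untouched, while a centered 1 kHz tone is cut to a small fraction of its level. Because the filter slope is so gentle, the low end of a voice will leak through too, so the cutoff would need tuning by ear.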
 