Anyone need an old multitrack separated?

  • Thread starter: boblybob
Alright, I realized I won't be able to separate anything better than I already have.

I'll separate the first 30 secs of the song you gave me to prove I'm not using any Rock Band or Guitar Hero tracks.

Just don't expect anything better than what I've shown you.
 
How can I create instrumental versions (without the vocals) of the following songs:

All My Loving
Strawberry Fields Forever
With a Little Help From My Friends
Helter Skelter

PS: preferably the movie versions from Across the Universe
 
Alright, I realized I won't be able to separate anything better than I already have.

I'll separate the first 30 secs of the song you gave me to prove I'm not using any Rock Band or Guitar Hero tracks.

Just don't expect anything better than what I've shown you.
You don't have to bother now. What's important is that you (and all newbs reading this) now realize that what you originally claimed cannot be done. That's all we really needed to hear.

And especially since you'll be deleting that illegal software from your computer(s) now, you don't have to try to prove anything else with it.

G.
 
You don't have to bother now. What's important is that you (and all newbs reading this) now realize that what you originally claimed cannot be done.


Having turned this over in my brain a few times, I'm beginning to think that this may in fact be theoretically impossible (as in, according to current theory, it can't be done).

To sum up the idea, it's based on the principle that any audible signal can be treated as a sum of pure sinusoids in varying proportions - the basic idea behind Fourier analysis. Combining two signals (like a mixed stereo file) is simply a further summing of sinusoids (correct me if I've got this wrong).

Based on that, a combination of signals "doesn't care" where any given sinusoid amplitude originated in the original signals. It only "cares" about the total description of the combined sinusoids.
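
Here's a minimal numpy sketch of that point (the frequencies and amplitudes are arbitrary, picked just for the example): two completely different sets of "sources" produce sample-identical mixes, so nothing in the mix itself can tell you which scenario it came from.

```python
import numpy as np

fs = 44100                   # sample rate in Hz
t = np.arange(fs) / fs       # one second of time

# Scenario A: a 440 Hz "instrument" and an 880 Hz "instrument".
a1 = np.sin(2 * np.pi * 440 * t)
a2 = np.sin(2 * np.pi * 880 * t)

# Scenario B: one "instrument" playing both partials, plus silence.
b1 = a1 + a2
b2 = np.zeros_like(t)

mix_a = a1 + a2
mix_b = b1 + b2

# The mixes are sample-for-sample identical, so no analysis of the
# mix alone can recover which set of sources produced it.
print(np.allclose(mix_a, mix_b))   # True
```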

This is why filtering a signal to remove noise is often difficult: when you cut the offending frequencies, you can't tell the filter to remove only the energy that "belongs" to the unwanted signal. The mixed signal doesn't preserve that distinction. So filtering at a particular frequency doesn't really remove the noise; it just makes that entire sinusoid component quieter.
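
A crude numpy illustration of that (again, the tones are made up for the example): notch out the frequency where the noise lives and you also lose whatever part of the instrument lives there, because the spectrum only stores the sum.

```python
import numpy as np

fs = 44100
t = np.arange(fs) / fs

# A toy "instrument": 500 Hz fundamental plus harmonics at 1000 and 1500 Hz.
instrument = (np.sin(2 * np.pi * 500 * t)
              + 0.5 * np.sin(2 * np.pi * 1000 * t)
              + 0.25 * np.sin(2 * np.pi * 1500 * t))

# An unwanted 1000 Hz whine sitting right on the second harmonic.
noise = 0.3 * np.sin(2 * np.pi * 1000 * t + 1.0)
mix = instrument + noise

# The crudest possible filter: zero the 1000 Hz bin of the spectrum.
spectrum = np.fft.rfft(mix)
spectrum[1000] = 0   # bin spacing here is fs/N = 1 Hz, so bin 1000 = 1000 Hz
filtered = np.fft.irfft(spectrum, n=len(t))

# The whine is gone - but so is the instrument's 1000 Hz harmonic.
# The filter removed the total energy at that frequency; it had no
# way to remove only the portion that "belonged" to the noise.
expected = instrument - 0.5 * np.sin(2 * np.pi * 1000 * t)
print(np.max(np.abs(filtered - expected)))   # ~0
```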

It might seem like this isn't the case because you can do things like inverse summation for null testing. In that case, though, the filtering is trivial: you know exactly which frequency elements to attenuate and by exactly how much. That's why when you "null out" a signal you don't hear artifacts of it: the mix doesn't "care" where the original summation came from, and nullifying simply removes that signal's elements. In the real world, you'd need a complete description of the individual elements of the mixed signal to separate a stereo file in the way boblybob is proposing. But if you've got that complete picture, you don't need to separate anything, because you already have the individual elements (at least as they're presented in the stereo field).
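
A quick numpy sketch of that point, with made-up stand-in signals: subtracting an exact, sample-accurate copy of one element nulls it perfectly and leaves everything else intact - but only because we started with a complete description of that element.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 44100
t = np.arange(fs) / fs

vocal = np.sin(2 * np.pi * 330 * t)   # stand-in for a known mix element
backing = rng.normal(0, 0.2, fs)      # stand-in for everything else
mix = vocal + backing

# "Nulling out": invert the known element and sum it with the mix.
# This is trivial only because we already have a perfect copy of it.
nulled = mix + (-vocal)

print(np.allclose(nulled, backing))   # True - the rest survives intact
```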

Does any of that make sense? Am I way off base on that?
 
Moseph, that was an excellent explanation, and I think pretty much hits the nail on the head. Very nice! :)

When you sum tracks together, you are combining multiple waveforms into a single waveform. You can't - with today's consumer technology, anyway - perfectly "deconstruct" that waveform back into its constituent "parts" through simple waveform frequency/amplitude editing. That's like trying to separate water back into hydrogen and oxygen by running it through physical filters.

Now, that's not to say that perhaps in the future (or maybe now in some advanced computer lab somewhere) one couldn't design some "intelligent" software that could analyze the content, determine that this is saxophone and that is piano, and reconstruct or synthesize separate tracks based upon its own sample/algorithm library, but that's a different animal altogether from waveform editing.

G.
 
Having turned this over in my brain a few times, I'm beginning to think that this may in fact be theoretically impossible (as in, according to current theory, it can't be done).

Change "current" to modern and I'd agree with that statement. The only reason I won't agree categorically is the human ear actually does a fairly respectable job of doing this on its own - not necessarily "soloing" an instrument in a mix, exactly, but latching onto, say, just the lead guitar, or just the vocal, and "identifying" it in a song through a haze of other competing frequencies.

I don't know if technology will ever get us to the point where this becomes possible, since this is one of those textbook examples of something that comes pretty easily to the human mind and is almost impossible for a computer, but stranger things have probably happened.
 
...not necessarily "soloing" an instrument in a mix, exactly, but latching onto, say, just the lead guitar, or just the vocal, and "identifying" it in a song through a haze of other competing frequencies.

But that's kinda what this software is already doing...just "latching" onto the core of a sound/instrument, but never able to fully extract it cleanly without affecting anything else.
I think our ears are no better...they can focus on one element but not extract it 100%.

I'm bored....

You started off the thread by saying you were bored and that's why you're messing around with this software.
That's fine...but I'm wondering why you don't quench your boredom with some actual recording/mixing...instead of doing this (which isn't working out very well, as has been demonstrated)?
 
But that's kinda what this software is already doing...just "latching" onto the core of a sound/instrument, but never able to fully extract it cleanly without affecting anything else.
I think our ears are no better...they can focus on one element but not extract it 100%.

I didn't listen to boblybob's clips, but I can't help feeling like our ears are a LOT better at it, though.
 
Well...at least they can hear everything that's wrong with those clips! :D
 
Moseph, that was an excellent explanation, and I think pretty much hits the nail on the head. Very nice! :)

Thanks. When I was writing it I felt like I was using too much jargon, but at least it made sense to somebody.


When you sum tracks together, you are combining multiple waveforms into a single waveform. You can't - with today's consumer technology, anyway - perfectly "deconstruct" that waveform back into its constituent "parts" through simple waveform frequency/amplitude editing. That's like trying to separate water back into hydrogen and oxygen by running it through physical filters.

Now, that's not to say that perhaps in the future (or maybe now in some advanced computer lab somewhere) one couldn't design some "intelligent" software that could analyze the content, determine that this is saxophone and that is piano, and reconstruct or synthesize separate tracks based upon its own sample/algorithm library, but that's a different animal altogether from waveform editing.

G.


Having thought about this a little more, it does occur to me that you could conceivably do this with a "brute force" method, something akin to adaptive learning techniques: a computer spits out a group of filtered results, a human (or humans) ranks them by how good they are, the computer uses those assessments to produce another set of results, which is then ranked again, and so on until a human controller decides that the separation is sufficiently clean. Sort of a way to force the results to "evolve" into something worthwhile (there's a rough sketch of one iteration of such a loop at the end of this post).

The real problems I see with this type of approach are:

(01) This is an insanely tedious process. Even in a band-limited signal with 2 tracks to separate, you have effectively infinite filtering possibilities when you consider bandwidth and gain granularity for each filter at a given center frequency. Even with a building full of machines and a full-time staff working round the clock, decent results could take who-knows-how-long.

(02) This couldn't ever be fully automated, because the machine has no way of determining how many separated tracks are appropriate. A human would need to judge whether a given track has voice, guitars, drums, etc., and how many of those instruments have multiple parts (two guitars, say).

This also assumes that the rough ordering of ranked results is about the same from human to human, and that the output doesn't need to account for stereo field position or multi-source signals (i.e., a stereo pair of mics on a single instrument).
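
Purely as illustration, here's a toy Python sketch of one version of that loop. Everything in it is made up for the example (the filter-parameter encoding, the ranges, the population sizes), and human_rank is just a placeholder for the step where actual listeners would audition and rank the filtered results:

```python
import random

# A candidate "separation" is a list of (center_freq_hz, bandwidth_hz,
# gain_cut_db) filter settings - a deliberately simplistic encoding.

def random_candidate(n_filters=8):
    return [(random.uniform(20, 20000),   # center frequency (Hz)
             random.uniform(10, 2000),    # bandwidth (Hz)
             random.uniform(-24, 0))      # gain cut (dB)
            for _ in range(n_filters)]

def mutate(candidate):
    # Nudge one randomly chosen parameter of one filter.
    filters = [list(f) for f in candidate]
    target = random.choice(filters)
    target[random.randrange(3)] *= random.uniform(0.8, 1.25)
    return [tuple(f) for f in filters]

def human_rank(candidates):
    # Placeholder: in practice, listeners would audition the filtered
    # audio for each candidate and order them best-first.
    return sorted(candidates, key=lambda c: random.random())

population = [random_candidate() for _ in range(20)]
for generation in range(100):    # or: until a human says "good enough"
    ranked = human_rank(population)
    survivors = ranked[:5]       # keep the best-judged candidates
    population = survivors + [mutate(random.choice(survivors))
                              for _ in range(15)]
```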
 
Can you separate this track for me?

Maino - Hi Hater.

I'm doing a comedy video & I want the part where he says HI HATER
 
The only reason I won't agree categorically is that the human ear actually does a fairly respectable job of doing this on its own - not necessarily "soloing" an instrument in a mix, exactly, but latching onto, say, just the lead guitar, or just the vocal, and "identifying" it in a song through a haze of other competing frequencies.
I think much of that part of it could be done by computer, if not today, then next year or soon enough. Maybe not quite as well as the human ear yet, but I envision that part of it as potentially possible.

But that's just the first half of it. Where the human ear really excels is in automatically filling in the blanks. Gekko Zed alluded to part of this, though in a different context, just yesterday when we were talking with Z3No about monitor quality and selection. In a way, our minds can interpret what might not be quite audible because of masking, and cause us to "hear" something we aren't actually hearing in the nominal sense of the word. It's kind of like the way our brain makes us think we're seeing smooth motion when we watch a movie, instead of individual still frames played in rapid succession.

Getting the computer to recognize guitar is one thing - and still not easy - but then it has to have the "intelligence" to be able to fill in the blanks caused by waveform masking, by synthesizing or reconstructing those parts that cannot physically be separated from the other instruments, and do so in a way that doesn't sound patched or synthesized.

The more I think about all that, the more I think it's not all that unreasonable a task to expect from the best of today's computer technology. Very difficult to pull off, for sure, with a lot of programming and a lot of testing and refinement required, and maybe just slightly out of today's reach (or maybe not?!); but when I consider what products like Melodyne, Shazam, DigiTech's musIQ and even the current state of voice recognition can accomplish, I think many of the basic building blocks for detecting instruments in a mix may already be halfway there.

G.
 