masteringhouse said:
G,
Do you have any further info on this? I've heard of bitmap proxies in regard to image data, but not audio.
A common example of a proxy in audio is what happens when you import an MP3 into a standard WAV editor. For example, open an MP3 in Sound Forge. It'll pop up a dialog box stating that it's creating a proxy for the MP3 before the final waveform display appears for editing. On faster CPUs this popup may be only momentary, perhaps even too fast to read. (In fact I'm referring to version 6, which is what I still use; it's possible they've removed this popup in later versions, I don't know. But I'm sure the underlying process is basically the same.)
The central issue is that MP3 is a format based upon a lossy codec ("codec" meaning encoder/decoder). MP3 is to WAV audio what JPG is to BMP or TIFF graphics in this regard. (It's also like what MPEG2 is to uncompressed AVI in video; in fact "MP3" is shorthand for "MPEG-1 Audio Layer 3".) An integral part of MP3 encoding is shrinking the file by, when possible, replacing long strings of bits with shorthand (and sometimes only approximate) mathematical descriptions of those bits. It's kind of like replacing the string "0000000000000000" with the string "0x16" (read that as "zero, times 16", not as a hex number). Instead of taking 16 characters to describe the string of 16 zeros, we've cut that particular string by a factor of four, down to four characters, by "encoding" the information using an agreed-upon code.
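To make the "0x16" shorthand concrete, here's a minimal Python sketch of that kind of run-length code. The function name `rle_encode` and the "|" separator between runs are my own inventions for illustration; this is the toy code from the paragraph above, not anything the MP3 codec actually does.

```python
from itertools import groupby

def rle_encode(bits: str) -> str:
    """Toy run-length code: each run of identical characters becomes
    '<character>x<run length>', with runs separated by '|'."""
    return "|".join(
        f"{symbol}x{len(list(run))}" for symbol, run in groupby(bits)
    )

original = "0" * 16              # "0000000000000000"
encoded = rle_encode(original)   # "0x16"

# 16 characters in, 4 characters out
print(encoded, len(original), len(encoded))
```

Note that this particular toy is lossless: nothing is thrown away, the same information is just spelled more compactly.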
A more lossy form might be replacing all the bits it would take to describe the upslope of the initial attack of a waveform 10 samples long (which uncompressed would require a full 16-bit description of every sample in the slope) with a string that gives the initial amplitude, an algebraic description of the slope (e.g. "y=2x+3"), and the length of the slope in samples (in this case, 10). In such a case the bitmap description of the slope, which took 160 characters (16 bits times 10 samples), has been reduced to somewhere around 30 characters (when you include stop bits and such). The price is that any sample that didn't sit exactly on that line comes back only approximately.
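Here's a rough Python sketch of that lossy idea, assuming the slope is stored as just a start value, a per-sample step, and a length. The names `encode_ramp` and `decode_ramp` and the sample values are made up for the demonstration; a real codec is far more sophisticated than a straight-line fit.

```python
def encode_ramp(samples):
    """Lossy: keep only the first value, the average step per sample,
    and the number of samples."""
    n = len(samples)
    step = (samples[-1] - samples[0]) / (n - 1) if n > 1 else 0.0
    return samples[0], step, n

def decode_ramp(start, step, length):
    """Rebuild an approximation of the original samples from the line."""
    return [round(start + step * i) for i in range(length)]

# Ten made-up samples of a rising attack
attack = [3, 5, 8, 9, 11, 13, 14, 17, 19, 21]

params = encode_ramp(attack)    # start=3, step=2.0, length=10, i.e. y = 2x + 3
approx = decode_ramp(*params)

# Samples that were off the line do not survive the round trip
changed = [i for i, (a, b) in enumerate(zip(attack, approx)) if a != b]
print(params, approx, changed)
```

Three numbers now stand in for ten samples, but the two samples that wandered off the straight line (the third and the seventh) come back slightly wrong. That's the "lossy" part.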
These are gross oversimplifications of what the MP3 codec is actually doing, of course, and this is not technically how it works (we're ignoring bits vs. bytes, for example), but it's a fair demonstration of the general concept of how mathematical shorthand and approximation are used to compress bitmapped information (audio, video, or graphic) into smaller sizes via lossy compression. It's not totally unlike a stenographer using shorthand notation to record actual uncompressed conversation.
But what happens when we want to edit the encoded information? What happens if "0000000000000000" needs to be changed to "0000000000001000"? It's not a simple matter of just going in and changing the "spelling" of the phrase "0x16". That phrase is now completely inadequate for describing the uncoded information, because there is no longer a string of sixteen identical characters. It needs to be decoded at some level within the software so that the new "1" can be added, and then re-encoded to something like "0x12[stop bit]1[stop bit]0x3".
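That decode, edit, re-encode cycle can be sketched in a few lines of Python. Everything here is hypothetical and only for illustration: the function names, and the "|" character standing in for the "[stop bit]" separator. Note also that this toy writes the lone "1" as "1x1" rather than a bare "1".

```python
from itertools import groupby

SEP = "|"  # stands in for the "[stop bit]" separator

def encode(bits: str) -> str:
    """Toy run-length code: '<character>x<run length>' per run."""
    return SEP.join(f"{s}x{len(list(run))}" for s, run in groupby(bits))

def decode(encoded: str) -> str:
    """Expand every '<character>x<run length>' token back into raw bits."""
    out = []
    for token in encoded.split(SEP):
        if token:
            symbol, count = token.split("x")
            out.append(symbol * int(count))
    return "".join(out)

def edit_bit(encoded: str, index: int, new_bit: str) -> str:
    """Edit one bit by way of an uncompressed proxy, then re-encode."""
    proxy = list(decode(encoded))    # 1. decode to an editable proxy
    proxy[index] = new_bit           # 2. make the edit on the proxy
    return encode("".join(proxy))    # 3. encode the proxy afresh

edited = edit_bit("0x16", 12, "1")
print(edited, decode(edited))
```

There's no way to get from "0x16" to the edited result by tweaking the encoded text directly; the edit only makes sense on the decoded proxy.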
To the best of my understanding, even those packages that tout themselves as "MP3 editors" have to decode (or translate, if you will) at least part of the file back to an uncompressed bitmap so that editing can actually be done at the bit level. Sound Forge and other "standard" editors that let you drag MP3s in for editing create bitmap proxies on which to perform the actual editing. "MP3 editors" have to do the same to at least part of the file, if not all of it.
It's like when you go into Photoshop and edit a compressed JPG file. The picture you see on your monitor is actually a pixel-for-pixel representation of how the JPG decodes. It's a bitmap proxy of the JPG information. It has to be for you to be able to do pixel-level editing of the image. Then when you save your edit, the visible proxy is freshly encoded as a new version of the JPG file.
G.