There's a lot of talk these days about the benefits of 24-bit/96 kHz recording over the old CD standard. Often there is a lack of technical knowledge, and instead you are fed final verdicts like "There's a significant audible difference..." or "There's no audible difference whatsoever..."
Claimed benefits of recording at a greater bit depth are:
- Better dynamics/headroom - you can afford to lose a few bits here and there due to low recording levels and still have at least 16 used bits in every track when it's time for mixdown.
- Plugins that can work internally at higher bit depths sound better when fed 24 bits rather than 16.
- A multitude of audio tracks mixed down to a final stereo track places a higher demand on the digital representation than just a few tracks do. The 16-bit/44.1 kHz format is not sufficient to hold all this data accurately.
I feel comfortable with the first and second claims, but the third one I find questionable. Is it really true that it requires more digital "resources" to represent the sound of a full symphony orchestra than to represent a simple sine wave? And if so, is the CD-standard format not sufficient for an accurate representation of very complicated waveforms (like the symphony orchestra, or a multitude of mixed-down audio tracks)? Is there a real degradation in sound, increasing with every added track, when mixing down several 16/44.1 tracks to a single 16/44.1 master? Or is this just plain nonsense? If it is true, it also implies that you would benefit from mixing down a multitude of 16/44.1 tracks to a 24/96 (or at least a 24/44.1) master, which could more accurately hold all the data from the tracks added together, provided all 24 bits are used and not just the lowest 16.
To elaborate: mixing down a bunch of 16-bit tracks within the same system (e.g. Cubase's Export Audio) to a final 24-bit mix with everything at unity (master set not to go above 0 dB) would result in a 24-bit file with only 16 of the bits used. But if you count the added headroom of these 8 unused bits, you should be able to raise the master faders n dB above 0 dB (calibrated for 16 bits) and thus get a 24-bit mixdown with all bits used. Mixing down from 16 bits doesn't mean mixing down 1x16 bits but Nx16 bits, where N is the number of tracks. When mixing down to 24 bits you can allow more of these Nx16 bits to be transferred to the mix, to put it simply. In real life the equation gets a bit more complicated if we consider that some tracks are stereo and some are mono while the final mix is stereo; Nx16 then corresponds to the total number of mono tracks (where each stereo track is counted as two mono tracks), and the resulting mix has 2x24 bits of capacity.
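To attach rough numbers to that headroom argument, here is a back-of-the-envelope sketch in Python (the function names are mine, purely for illustration): summing N full-scale 16-bit tracks needs at most ceil(log2(N)) extra bits, and each spare bit buys about 6 dB.

```python
import math

def extra_bits_needed(n_tracks):
    """Worst-case extra integer bits needed to sum n_tracks full-scale tracks."""
    return math.ceil(math.log2(n_tracks))

def bits_to_db(bits):
    """Dynamic range bought by a number of bits: about 6.02 dB per bit."""
    return 20 * math.log10(2 ** bits)

print(extra_bits_needed(24))    # 5 -> 24 full-scale tracks fit in 16 + 5 = 21 bits
print(round(bits_to_db(8), 1))  # 48.2 -> the 8 spare bits give about 48 dB of headroom
```

So even a 24-track sum of full-scale 16-bit signals would fit in 21 bits, well inside a 24-bit file - if my reasoning holds.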
My guess is that audio quality is preserved no matter how many tracks are mixed down, because every track ends up in the mix at only a fraction of its original amplitude. Everyone knows that the more tracks you have in a mix, the more you have to back off the faders to keep the summed signal from going above 0 dB and introducing digital clipping (provided the individual tracks are recorded at or near 0 dB). The amplitudes of all the tracks are added together and must be reduced to fit the headroom of the final mix. Less amplitude/volume means fewer bits are needed for the representation, and thus no significant degradation in sound.
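The amplitude-versus-bits relationship is easy to check numerically. A toy sketch in plain NumPy (a naive round-to-grid quantizer of my own, no dither): a signal backed off 30 dB before 16-bit quantization sits about 30 dB closer to the noise floor, i.e. it effectively uses about 5 fewer bits, while a 24-bit container restores the margin.

```python
import numpy as np

def quantize(x, bits):
    """Naive round-to-nearest quantizer on a signed grid spanning [-1, 1)."""
    step = 2.0 ** (1 - bits)
    return np.round(x / step) * step

def snr_db(clean, quantized):
    """Signal-to-quantization-noise ratio in dB."""
    err = quantized - clean
    return 10 * np.log10(np.mean(clean ** 2) / np.mean(err ** 2))

t = np.arange(65536) / 44100.0
loud = 0.9 * np.sin(2 * np.pi * 997.0 * t)   # near full scale
quiet = loud * 10 ** (-30 / 20)              # backed off 30 dB

print(snr_db(loud, quantize(loud, 16)))    # near the 16-bit limit (~6 dB per bit)
print(snr_db(quiet, quantize(quiet, 16)))  # about 30 dB worse: fewer effective bits
print(snr_db(quiet, quantize(quiet, 24)))  # the extra 8 bits restore the margin
```

Whether that shifted noise floor counts as "no significant degradation" at mixdown levels is exactly the question I'm asking.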
The benefits of an increase in sample rate are more questionable. According to Nyquist, 44.1 kHz can accurately reproduce frequencies up to 22.05 kHz, which is above most people's (and certainly most ear-abused musicians') hearing limit. Sufficient oversampling and good AD/DA converters are of course a must. What would be the benefits of increasing the sample rate to 96 kHz? Do you get a flatter frequency response and less distortion at the highest audible frequencies? Is the Nyquist theorem just theory, and is a 44.1 kHz sample rate, when it comes down to it, really not enough to accurately reproduce all audible frequencies? Where do, for example, oversampling and statistical quantization errors come into the picture, the latter suggesting greater accuracy at sampling frequencies higher than 44.1 kHz?
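The Nyquist part can at least be tested numerically. A toy sketch of my own using NumPy's FFT: a 20 kHz tone sampled at 44.1 kHz keeps exactly its own frequency, while a 25 kHz tone (above Nyquist, which a real converter's anti-alias filter would have to remove) folds back to 19.1 kHz. This suggests the practical questions are about filtering and oversampling at the band edge, not about the theorem itself.

```python
import numpy as np

fs = 44100                  # CD sample rate
n = fs                      # one second of samples -> 1 Hz FFT resolution
t = np.arange(n) / fs

def peak_hz(signal):
    """Frequency bin with the most energy, in Hz."""
    return np.argmax(np.abs(np.fft.rfft(signal))) * fs / n

tone = np.sin(2 * np.pi * 20000 * t)    # below Nyquist (22050 Hz)
alias = np.sin(2 * np.pi * 25000 * t)   # above Nyquist

print(peak_hz(tone))   # 20000.0 - captured at exactly its own frequency
print(peak_hz(alias))  # 19100.0 - folds down: 44100 - 25000
```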
These are, I think, important questions for anyone involved in digital recording. Any facts beyond "I hear the difference" are welcome. Pointers to good literature (books, web sites) dealing with these questions would also be appreciated.
/Mats D, Sweden