Advantage of 24 vs 16 bit, a primer...

BigRay

New member
from www.24bitfaq.org :


One of the most important concepts in recording is called dynamic range, and is measured in dB. The dynamic range of a recording is the difference between its loudest point and its quietest point.

To elaborate further, each bit gives us the ability to represent about 6dB of dynamic range. A passage that is 6dB louder than another passage has twice the amplitude (signal voltage) of the other passage. In the 4-bit example, we theoretically have 24dB of dynamic range that can be used. But what if the recording doesn’t take advantage of all that dynamic range? What if the recording never peaks higher than 6dB below its maximum possible limit? In this case, the recording would only take advantage of the 3 least significant (right-most) bits, meaning 18dB of dynamic range. 16-bit recordings are capable of a theoretical maximum of 96dB of dynamic range. This means that a single wave could have up to 65,536 discrete values that can be used to represent it. But if the same wave recorded at 16-bit peaks at 48dB below its maximum possible limit, then there would only be 256 discrete values that can be used to represent it, taking advantage of only the 8 least significant bits. The 8 most significant bits would contain no information whatsoever, and would remain unused. In the case of 24-bit recording, you’d have a maximum of 16,777,216 values to choose from, and in the case of a wave peaking at 48dB below its maximum possible limit, the wave would still have 65,536 possible discrete amplitude values that could be used to represent it.
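The arithmetic in the paragraph above can be checked with a few lines of Python. This is only a sketch: the function names `dynamic_range_db` and `levels_at_peak` are made up for illustration, and it assumes plain linear PCM where the codes are counted as unsigned magnitudes. The exact figure per bit is 20*log10(2), roughly 6.02dB, so the results come out slightly different from the rounded "6dB per bit" numbers quoted in the FAQ.

```python
import math

def dynamic_range_db(bits):
    """Theoretical dynamic range of linear PCM: 20*log10 of the number of codes."""
    return 20 * math.log10(2 ** bits)

def levels_at_peak(bits, peak_dbfs):
    """Approximate number of codes spanned by a signal that peaks at peak_dbfs
    (0 = full scale, negative = below full scale)."""
    return round(2 ** bits * 10 ** (peak_dbfs / 20))

for bits in (4, 8, 16, 24):
    print(f"{bits:2d} bits: {2 ** bits:>10,} codes, {dynamic_range_db(bits):6.2f} dB")

# The same quiet peak leaves far more codes in a 24-bit word than in a 16-bit one.
for bits in (16, 24):
    print(f"{bits} bits, peak at -48 dBFS: about {levels_at_peak(bits, -48):,} codes")
```

With the exact 6.02dB figure the -48dB case gives a few more codes than the FAQ's 256 and 65,536, but the ratio between the 16-bit and 24-bit cases is still 256 to 1.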

Now, have you ever heard any of the early 8-bit computer recordings that floated around in the early days of home computers? Didn’t they sound just awful? I mean, you were impressed because you had a snippet of music that you could recognize playing from your computer, but you wouldn’t want to listen to it for more than a minute or two. I personally remember playing back an 8-bit digitized 5 second snippet of Van Halen’s rendition of the Kinks’ “You Really Got Me” over and over again on my Atari 800 until I couldn’t take it anymore. The thrill soon had me building an 8-bit digitizing device with a microphone input jack and a connector for the joystick port. Ah, those were the days… but I digress.

Perhaps many are more familiar with 8-bit audio from real-time internet sources like RealAudio. It’s good enough for speech recognition, but leaves all too much to be desired for music.

Now here’s the kicker in the 16-bit realm. While the volume level of a recorded low-E note struck on an acoustic guitar might take advantage of all 16 available bits (for instance, where the peak on the DAT deck reaches 0dB), the squeak of the fingers on the string, the scratch of the pick hitting the string, and the 5 or 10 audible harmonic overtones of that note may never reach a point beyond 48dB shy of the 96dB maximum. Yes, all of these additional by-products of that low-E string that make the guitar sound alive and compelling receive all of the fidelity of that scratchy, distorted, computerized sound of that 8-bit sample from long ago. And as the basic low-E note fades out, it too gets the same butcher treatment from the ever decreasing number of discrete amplitude values. Yikes!

Now record with a 24-bit word length, and put the CD quality back into those string squeaks, pick scratches, and overtones. With 24 bits, you can hear the clarity of the cymbals decaying as they keep ringing smoothly down to complete silence. The little low-level smack of the bass pedal head hitting the bass drum skin that sounded barely like a small click before (if audible at all) now sounds like a smack, complete with its own smoothly reverberating decay. Even the low-level acoustical reflection from the wall behind the band now contributes to the experience with added detail and a sense of ambience, not simply low-level distortion. Finally, because of this improvement, the recordist no longer has to risk overloading and clipping the recording in an effort to achieve maximum fidelity. Levels can be set conservatively with the assurance that a high degree of fidelity is maintained.
 
Cross-posting is a no-no regardless of the word length.

G.
 
I am well aware of that, thanks.

When I posted it the first time, I got an error... and so I thought it didn't post.


Believe you me, I am well aware of the regulations governing internet forums (they are mostly the same, the basics anyway), and any mistake I make will be just that, a mistake, not intentional.

-teddy
 
No prob, TR.

Now maybe you can answer a question that has bugged me all along. Why do they choose to limit resolution to 6dB/bit? Seems to me that when we get above 24-bit, dynamic range is no longer as much of an issue as resolution is. I'd rather have 90dB of range at 3dB resolution than 180dB of range at 6dB resolution.

G.
 
people claim that the dynamic range of a digital device is 6db per bit...
In digital audio, each bit offered by the system doubles the (voltage) resolution, corresponding to a 6 dB ratio. For instance, a 16-bit (linear) audio format offers a theoretical maximum of (16 x 6) = 96 dB, meaning that the maximum signal (see 0dBFS, above) is 96 dB above the quantization noise.

--d.o.


SouthSIDE Glen said:
No prob, TR.

Now maybe you can answer a question that has bugged me all along. Why do they choose to limit resolution to 6dB/bit? Seems to me that when we get above 24-bit, dynamic range is no longer as much of an issue as resolution is. I'd rather have 90dB of range at 3dB resolution than 180dB of range at 6dB resolution.

G.
 
BigRay said:
people claim that the dynamic range of a digital device is 6db per bit...
In digital audio, each bit offered by the system doubles the (voltage) resolution, corresponding to a 6 dB ratio. For instance, a 16-bit (linear) audio format offers a theoretical maximum of (16 x 6) = 96 dB, meaning that the maximum signal (see 0dBFS, above) is 96 dB above the quantization noise.

--d.o.
I understand all that. What I'm saying is that by the same math you could devise, for example, a system that's 32-bit, but the voltage differential per bit is halved, and would represent a 3dB change. Then you'd have 32*3, or 96dB of range, just as with the current 16-bit formula before, but you'd have double the resolution in defining the amplitude of the sample.

I'll take a floor of -96dB with 3dB resolution over a floor of -192dB at 6dB resolution any day.

G.
 
You are exactly right, Glen. Someone a long time ago implemented these numbers and made them a standard, and that's what we are dealing with. We need forward thinkers to revise the old habits.

Teddy
 
I think what you are talking about is called "non-linear quantization". I typed it into Google and got lots of results.

It looks like it was used on 8-bit telephone audio encoding among other applications. I didn't have time to read about the pros and cons though...
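For anyone curious what non-linear quantization looks like in practice, here is a rough sketch using the mu-law companding curve from telephony. It uses the continuous textbook formula with mu = 255, not the piecewise-linear tables a real telephone codec uses, and the helper names (`compress`, `expand`, `quantize`, and the two round-trip functions) are invented for this example. The idea is to squash the signal logarithmically before an 8-bit uniform quantizer and undo the squash afterwards, so quiet samples get finer steps than loud ones.

```python
import math

MU = 255  # companding constant of the standard mu-law curve

def compress(x):
    """Continuous mu-law curve: maps [-1, 1] onto [-1, 1] logarithmically."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def expand(y):
    """Inverse of compress()."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def quantize(v, bits):
    """Plain uniform quantizer on [-1, 1] with a signed range of codes."""
    levels = 2 ** (bits - 1) - 1
    return round(v * levels) / levels

def mu_law_roundtrip(x, bits=8):
    return expand(quantize(compress(x), bits))

def linear_roundtrip(x, bits=8):
    return quantize(x, bits)

for x in (0.01, 0.1, 0.9):
    lin_err = abs(linear_roundtrip(x) - x) / x
    mu_err = abs(mu_law_roundtrip(x) - x) / x
    print(f"x={x}: linear error {lin_err:.2%}, mu-law error {mu_err:.2%}")
```

The trade-off is that loud samples end up with coarser steps than plain linear 8-bit would give them, which is acceptable for speech but is one reason linear PCM is preferred for music production.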

SouthSIDE Glen said:
I understand all that. What I'm saying is that by the same math you could devise, for example, a system that's 32-bit, but the voltage differential per bit is halved, and would represent a 3dB change. Then you'd have 32*3, or 96dB of range, just as with the current 16-bit formula before, but you'd have double the resolution in defining the amplitude of the sample.

I'll take a floor of -96dB with 3dB resolution over a floor of -192dB at 6dB resolution any day.

G.
 
The 6db "rule of thumb" CANNOT be thought of as a "resolution increase" in the traditional intuitive sense, but rather the amount of dynamic range increased.
The resolution throughout the entire range increases when more bits are used. The dynamic range increase is actually just a byproduct of the resolution.

The number of discrete volume "steps" of a 16-bit system available is 65,536.
The number in a 24-bit system is 16,777,216.

If we really only had 6db amplitude resolution, you wouldn't hear any difference between a 0dbFS sound and a -5dbFS sound. Yeah, right. That would be useless for audio.

In reality, even a 16-bit system has more than enough resolution to 100% accurately recreate any waveform so long as it is sufficiently loud.
"Sufficiently loud" is the real kicker though, since the effective resolution decreases as the volume does.

The limit of a digital system's dynamic range is actually the result of resolution limitations, not the other way around. You have to think of it like this.
Basically, in 16-bit audio, -96db is just the point where the resolution is so low that the level of error caused by the detected level "jumping" back and forth between the lowest bit (quantization error) will sound like a louder noise than the actual sound wanting to be reproduced.
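The claim that the rounding error stays about the same size while the signal shrinks can be tried out numerically. The following is a sketch only: `quantized_sine_snr_db` is a hypothetical helper, it assumes signed linear PCM with no dither, and it uses a sine at an arbitrary frequency as the test signal. It rounds each sample to the nearest integer code and compares signal power with error power.

```python
import math

def quantized_sine_snr_db(bits, level_dbfs, n=20000):
    """Round a sine wave to signed `bits`-bit integer codes and return the
    signal-to-quantization-error ratio in dB."""
    full_scale = 2 ** (bits - 1) - 1
    amplitude = full_scale * 10 ** (level_dbfs / 20)
    cycles_per_sample = math.sqrt(2) / 10  # irrational, so the samples never repeat
    signal_power = 0.0
    error_power = 0.0
    for i in range(n):
        x = amplitude * math.sin(2 * math.pi * cycles_per_sample * i)
        q = round(x)  # the quantizer: nearest integer code
        signal_power += x * x
        error_power += (q - x) ** 2
    return 10 * math.log10(signal_power / error_power)

for level in (0, -24, -48, -72, -90):
    snr16 = quantized_sine_snr_db(16, level)
    snr24 = quantized_sine_snr_db(24, level)
    print(f"{level:4d} dBFS: 16-bit {snr16:6.1f} dB, 24-bit {snr24:6.1f} dB")
```

In this model the ratio drops by roughly one dB for every dB the signal is lowered, and the 24-bit figure stays roughly 48dB ahead of the 16-bit one until the signal gets close to the bottom code.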
 
bleyrad said:
Basically, in 16-bit audio, -96db is just the point where the resolution is so low that the level of error caused by the detected level "jumping" back and forth between the lowest bit (quantization error) will sound like a louder noise than the actual sound wanting to be reproduced.
Ah, I *think* I might understand what you're saying. Let me try to rephrase it a bit and let me know if I have it right...

In 16-bit, -96dB is, in a fashion, actually the resolution; you have 64K steps of delta -96dB per step?

Is it as simple as a flat linear proportion of fixed volume per bit as implied above or is there a more algebraic relationship between volume and bit depending on bit position in the word?

G.
 
It is not a linear resolution scale because of the number of bits used to describe a particular sample.
I haven't formally studied digital theory so I am struggling to understand the reason behind this. The way I see it, though, in 24-bit words, the 16 bits representing the "bottom half" of the signal still only have 64K resolution. So in other words, lower than this halfway point, you only have 16-bit resolution and less - just at a lower level. As the volume increases from this point, for each 6db you get an additional bit of resolution. So at 6db higher than the halfway point, you have 17 bits with which to describe the level - 131,072 total points. So within the 6db range between 16 and 17 bits you have 65,536 discrete levels with which to map the sample. That's a ~0.00009 db resolution level. And in the 6db range between 23 and 24 bits (actually called bits 2 and 1 I think, just simplifying for math's sake) you have 8,388,608 discrete levels (the 16,777,216 total minus the 8,388,608 below them) - a resolution of about .0000007db. This is your resolution between 0dbFS and -6dbFS.
Thus as the level decreases beyond -6dbFS, so does the resolution. This is the key point regarding dynamic range. Once you reach -144dbFS, the resolution is exactly zero/infinity - and the quantization error noise from attempting to switch the bottom bit on/off while the signal is around this level (giving 6db resolution I guess) is so erroneous and "rounded" so much that the signal is no longer louder than the error.
Of course in reality this level cannot be reached due to normal electrical noise. So in a typical 24-bit system we have around 21-bits of dynamic range but with the resolution at the top of 24-bits.
This is just using my really rusty math skills and a few assumptions. Someone please correct me if I screwed up somewhere.
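The per-band counting in the post above can be written out as a short script. It is a sketch under the same simplification the post makes: the word is treated as an unsigned magnitude (real PCM samples are signed, which costs one bit), and `band_levels` and `average_step_db` are names made up here. Band 0 is the range from 0 to about -6 dBFS, band 1 the next 6 dB down, and so on; each band holds the codes from 2^(n-1) up to 2^n for the n bits in use at that level.

```python
import math

SIX_DB = 20 * math.log10(2)  # the exact "6 dB" figure, about 6.02

def band_levels(bits, band):
    """Number of codes inside the band-th 6 dB band below full scale,
    treating the word as an unsigned magnitude."""
    top = 2 ** (bits - band)
    return top - top // 2

def average_step_db(bits, band):
    """Width of the band in dB divided by the number of codes inside it."""
    return SIX_DB / band_levels(bits, band)

for bits in (16, 24):
    print(f"{bits}-bit word:")
    for band in (0, 1, 2, bits - 2, bits - 1):
        print(f"  band {band:2d}: {band_levels(bits, band):>9,} codes, "
              f"about {average_step_db(bits, band):.2e} dB per step")
```

The count halves with every band, so the top band of a 17-bit word holds 65,536 codes and the top band of a 24-bit word holds 8,388,608, while the very bottom band is left with a single code.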
 
I haven't checked your math, but it does sound as though it might be a logarithmic relationship, or something close to it.

Nevertheless, I finally have a rough handle on the proper way to consider it, which until now I have failed to grasp properly. You, sir, are the first source of many I have asked, read, searched, etc. who has been able to describe the right perspective for me. I thank you greatly! :)

G.
 
SouthSIDE Glen said:
I haven't checked your math, but it does sound as though it might be a logarithmic relationship, or something close to it.


I think 2^n-1 works for the most part, where 2= the number of possible values for a binary digit, n= wordlength and -1 means that no signal doesn't count. Makes it more of an exponential binary relationship.

I have no idea what this means, but it looks cool:


16 bit:

0 to -6 = 65,536
-6 to -12 = 32,768
-12 to -18 = 16,384
-18 to -24 = 8,192
-24 to -30 = 4,096
-30 to -36 = 2,048
-36 to -42 = 1,024
-42 to -48 = 512
-48 to -54 = 256
-54 to -60 = 128
-60 to -66 = 64
-66 to -72 = 32
-72 to -78 = 16
-78 to -84 = 8
-84 to -90 = 4
-90 to -96 = 2


24 bit:


0 to -6 = 16,777,216
-6 to -12 = 8,388,608
-12 to -18 = 4,194,304
-18 to -24 = 2,097,152
-24 to -30 = 1,048,576
-30 to -36 = 524,288
-36 to -42 = 262,144
-42 to -48 = 131,072
-48 to -54 = 65,536
-54 to -60 = 32,768
-60 to -66 = 16,384
-66 to -72 = 8,192
-72 to -78 = 4,096
-78 to -84 = 2,048
-84 to -90 = 1,024
-90 to -96 = 512
-96 to -102 = 256
-102 to -108 = 128
-108 to -114 = 64
-114 to -120 = 32
-120 to -126 = 16
-126 to -132 = 8
-132 to -138 = 4
-138 to -144 = 2



sl
 
I don't think this is correct. 65,536 represents the TOTAL discrete steps in a 16-bit word available between 0 and -96db... not 0 and -6db.

2^n is how you figure out this total resolution for any given bit-depth. This is how I worked out all the numbers for my last post.

I could figure out the exact relationship and answer to this question by graphing the results one bit at a time, but that would take a bit of academic effort, and frankly the reason I'm taking time off school is so I don't have to do that sort of crap.


I believe that using a floating-point system should mean that the resolution is linearized and all bits are available for describing all levels. So in 32-bit float, the level could be nearly infinitely low or high and you'd still have a full 24-bit resolution available. This is just theorizing though, I haven't looked at exactly how floats work. And of course it would be useless while tracking because the converter is only going to give you its non-linear resolution in the first place.
Why not make a converter with linear resolution though, which outputs floating-point natively? :confused: I'm guessing it has something to do with the oversampling architecture used to achieve normal performance levels in the first place.. but theoretically, shouldn't it be possible if you don't do oversampling?
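The floating-point idea above is easy to poke at from Python. This is a minimal sketch, assuming IEEE 754 single precision (what the `struct` format code `f` packs) and using a made-up helper name, `float32_spacing`. It rounds a value to 32-bit float, finds the next representable float above it by bumping the bit pattern by one, and reports the gap relative to the value.

```python
import struct

def float32_spacing(x):
    """Return (gap, value): `value` is x rounded to 32-bit float and `gap` is the
    distance to the next larger 32-bit float. Intended for positive, normal x."""
    value = struct.unpack("<f", struct.pack("<f", x))[0]
    pattern = struct.unpack("<I", struct.pack("<f", value))[0]
    next_up = struct.unpack("<f", struct.pack("<I", pattern + 1))[0]
    return next_up - value, value

for x in (1e-6, 1e-3, 0.5, 1.0, 1e3):
    gap, value = float32_spacing(x)
    print(f"{x:g}: relative step {gap / value:.2e}")

# For comparison, the relative step of a fixed-point code is 1/code,
# which grows as the code gets smaller.
for code in (8_000_000, 65_536, 256):
    print(f"fixed-point code {code:,}: relative step {1 / code:.2e}")
```

The relative step of the float stays between 2^-24 and 2^-23 across the whole range tried here, which matches the intuition that the 24-bit significand travels with the signal level. That says nothing about what a converter can deliver, only about the storage format.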
 
bleyrad said:
I don't think this is correct. 65,536 represents the TOTAL discrete steps in a 16-bit word available between 0 and -96db... not 0 and -6db.

Actually, it's just a chart that represents that the range between 0 and -6 is using n bits (although it doesn't say that), which then shows what 2^n is at that point. You're right, at each step it's showing the total range from that point to the LSB. Total range up to that point. I'm just guessing, of course. I authored the charts, but as I've said, I don't know what they mean. Except 2^n, labeled with dynamic ranges instead of numbers of bits used.

I'm guessing that if you wanted to know the actual number of steps a waveform could take specifically in the range of 0 to -6 at 16 bits, you'd start with 65,536 and divide it in half to get rid of the range underneath it. Now of the 32,768 steps remaining, these are going to be divided again as positive and negative values in the waveform. Consider 16,384 steps or thereabouts in that range. Something like that anyway.

What the charts do show is that the resolution doubles with each added bit. They also show that the resolution drops off rather quickly. Same thing, really.


I'm guessing that this only applies to PCM audio. In a theoretical one-bit PCM system, you'd be able to record a range of frequencies, and they'd all come out at the same dynamic level - on or off. Sound or no sound. One level fits all.

So I'm wondering how dynamics are calculated with DSD audio. (Which has nothing to do with this thread, really...)


Anyway,


sl
 
I have a feeling there is an additional variable at play here such that all our talk of "resolution" is essentially meaningless.
Thinking about this in terms of actual Nyquist waveform theory, the actual recreated-waveform output volume resolution coming from the DAC should be infinite and analog in nature, limited only by the dynamic range.
This is because we are not dealing with each sample as a separate impulse; the samples are all mere mathematical descriptions of a continuous waveform with certain conditions assumed. The main one is that there should be no frequency above Nyquist. So any error caused by rounding a sample to the next discrete step might cause either a) too sharp a transition, in which case it will be corrected in the lowpass filter, or b) too slow a transition, in which case it would be incompatible with the surrounding samples which describe the rest of the waveform, and the waveform would still be plotted in the correct place.

So I think resolution is only really important when you start dealing with audible quantization error - signals which decay into or near the LSB.
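The reconstruction idea in this post can be sketched with the textbook sinc interpolation formula (Whittaker-Shannon). This is an idealized model and not how a real DAC filter is built; `sinc` and `reconstruct` are illustrative names, the test tone and its frequency are arbitrary, and the sum is truncated to 1000 samples, so a small truncation error is expected. It rebuilds the waveform at a point halfway between two samples, first from exact samples and then from samples rounded to 16-bit steps.

```python
import math

def sinc(t):
    """Normalized sinc: sin(pi*t) / (pi*t), with the limit value 1 at t = 0."""
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)

def reconstruct(samples, t):
    """Whittaker-Shannon interpolation at time t, measured in sample periods."""
    return sum(s * sinc(t - n) for n, s in enumerate(samples))

FREQ = 0.1  # cycles per sample, well below the Nyquist limit of 0.5
exact = [math.sin(2 * math.pi * FREQ * n) for n in range(1000)]
rounded = [round(s * 32767) / 32767 for s in exact]  # 16-bit style rounding

t = 500.5  # halfway between sample 500 and sample 501
true_value = math.sin(2 * math.pi * FREQ * t)
print("true waveform      :", true_value)
print("from exact samples :", reconstruct(exact, t))
print("from 16-bit samples:", reconstruct(rounded, t))
```

In this model the rounded samples land very close to the exact ones even between sample points; what the rounding adds is a small error signal riding on top of the waveform, which is the quantization noise the rest of the thread is about.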
 
bleyrad said:
I have a feeling there is an additional variable at play here such that all our talk of "resolution" is essentially meaningless.

....

So I think resolution is only really important when you start dealing with audible quantization error - signals which decay into or near the LSB.


it's not really linear since the number of bits to describe the value are higher with larger word sizes than smaller word sizes. for example, if our value is 24 then the number of bits involved is 5:

11000 = 24 (out of a maximum of 31)

but if our value is 2,500,345 then we have 22 bits:

1001100010011011111001

with a large corresponding number of bits to tweak the value. for instance if we go up or down 20, the larger number is significantly less impacted than the smaller number - hence the "resolution" on the larger number is better.

this is the key reason why digital audio needs higher levels to achieve the same clarity as analog because the lower the values sit in the word, the less resolution they're capable of capturing. so having 24 bits is important because we can put the higher levels into the higher word size and the lower end of the word is effectively down into inaudible ranges. with 16 bit we have to really cram everything we can into those top bits or risk less resolution.
and if we take low level 16 bit and try to raise it, guess what? we're effectively raising the distortion level because we're taking less resolution data and moving up into a more audible region.
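The two numbers used in this post can be checked directly in Python. Nothing here goes beyond the post itself; the variable names are just for illustration. It prints how many bits each value occupies and how big a change of 20 is relative to each value.

```python
small, large = 24, 2_500_345

for value in (small, large):
    print(f"{value:>9,} uses {value.bit_length():2d} bits: {bin(value)[2:]}")

# The same absolute change is a very different relative change.
for value in (small, large):
    print(f"a change of 20 is {20 / value:.4%} of {value:,}")
```

A step of 20 is most of the small value but under a thousandth of a percent of the large one, which is the sense in which the larger number has better "resolution".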

HTH
 
bleyrad said:
...the actual recreated-waveform output volume resolution coming from the DAC should be infinite and analog in nature, limited only by the dynamic range.
After conversion the waveform is analog, but it's still a synthesized waveform cooked from the recipe given by the digital data, which by its very definition has an inherent percentage of error built into it because the digital words are of finite length whereas the actual original analog values are (except for some very special conditions) infinite in precision. The length of the digital value defines the dynamic range, and the resolution is derived from that.

There will be some "smoothing" - a form of noise - added by the converters, not to mention plenty of interpolation for the infinite number of analog points in-between the digital samples, and you may be right that this all happens at a coarser level than the theoretical resolution of the digital system. What that's really saying is the noise level of the converters is higher than the "floor" of the digital information itself; i.e. that the converters are adding noise to the signal and reducing the total theoretical dynamic range.

This will be true regardless of the bit length. 64-bit converters will add noise to 64-bit data the same way as they do at 16- and 32-bit (I'm ignoring technical tricks here and just talking basic theory.) However, 64-bit will *still* provide finer resolution - and therefore more accurate reproduction - than lower bit lengths, regardless of any smearing that happens in the conversion process.

So I think it's a mis-step to say that the conversion process renders the question of resolution meaningless. Perhaps it would be a bit more accurate to say that the conversion process just renders the calculated resolution/dynamic range "theoretical". Those theoretical values are still meaningful.

G.
 
snow lizard said:
I'm guessing that this only applies to PCM audio. In a theoretical one-bit PCM system, you'd be able to record a range of frequencies, and they'd all come out at the same dynamic level - on or off. Sound or no sound. One level fits all.

Don't laugh--that's how sound was synthesized in the very early days of computing!

So I'm wondering how dynamics are calculated with DSD audio. (Which has nothing to do with this thread, really...)

It's a different principle--each subsequent bit changes the previous voltage level. So whereas in PCM you know that a discrete sample is a given voltage, with DSD you don't really know much at all without referencing the surrounding bits.
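A toy version of the one-bit idea looks something like this. It is a first-order delta-sigma modulator, which is a simplification chosen for illustration: actual DSD encoders use higher-order modulators running at a very high sample rate, and the function names and the moving-average "filter" here are assumptions, not anything from a DSD spec. Each output bit is +1 or -1, the loop keeps a running tally of how far the bits have drifted from the input, and the level only reappears once many bits are averaged together.

```python
def delta_sigma_1bit(samples):
    """First-order delta-sigma modulator. Input values in [-1, 1];
    output is a list of +1.0 / -1.0 'bits'."""
    integrator = 0.0
    previous_bit = 0.0
    bits = []
    for x in samples:
        # Accumulate the difference between the input and the last bit sent out.
        integrator += x - previous_bit
        previous_bit = 1.0 if integrator >= 0 else -1.0
        bits.append(previous_bit)
    return bits

def moving_average(bits, window):
    """Crude low-pass filter standing in for the playback filter."""
    return [sum(bits[i:i + window]) / window
            for i in range(len(bits) - window + 1)]

steady = delta_sigma_1bit([0.5] * 10000)
print("first bits:", steady[:12])
print("mean of all bits:", sum(steady) / len(steady))
print("filtered sample:", moving_average(steady, 64)[100])
```

No single bit says anything about the level; a steady input of 0.5 just makes the +1 bits outnumber the -1 bits three to one, which is why dynamics in a one-bit stream only make sense after filtering.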
 