The 24-bit challenge

  • Thread starter: Ethan Winer
you said, "imagine that Teach was able to eliminate the 9, 11, and 13 bit files and had just two files from which to guess which one was the 16-bit dithered file. Then two correct guesses yields only (0.5)^2 = 0.25. We're back to the situation where 5 correct guesses out of 5 are needed to get > 95% confidence that the results were not by chance."

With 3 files eliminated and 2 remaining, there should be a 0.5 probability, no? I don't believe we're assuming file replacement, are we?

There is a 0.5 probability that each guess yields a correct answer. If one guesses 5 times, then there is a 0.5^5 ≈ 0.03 probability of getting all of them correct by chance.
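
As a rough sketch of that arithmetic (the function names here are made up for illustration, and each guess is assumed to be an independent two-way choice):

```python
def chance_all_correct(n_guesses, p_correct=0.5):
    """Probability of getting every guess right purely by chance."""
    return p_correct ** n_guesses


def guesses_needed(alpha=0.05, p_correct=0.5):
    """Smallest run of correct guesses whose chance probability is below alpha."""
    n = 1
    while chance_all_correct(n, p_correct) >= alpha:
        n += 1
    return n


print(chance_all_correct(2))   # 0.25
print(chance_all_correct(5))   # 0.03125
print(guesses_needed())        # 5
```

Four correct guesses in a row still leaves a 0.0625 chance of luck, which is why five are needed to get under 0.05.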

ff123
 
Ethan Winer said:
camn,

> Where do you get the idea that 6db = 1 bit?? <

And where did you get the idea that 6 dB. is NOT one bit? :)

> If I drop volume by 18db ... I'm filling those last 3 bits with ZEROs. Which is MUCH different than chopping the bits. <

--Ethan

Please note that a 6 db drop is 1/2 of the voltage into the 24 bit analog-to-digital converter (ADC).

db = 20 log (V1/V2)

This means that 1/2 of the 24 bits go to zero and therefore a 6db drop affects 12 bits not 1 bit.
 
guhlenn said:
Please let me participate

camn,


>No, it's exactly the same. Since audio programs won't play a 13-bit file, the only recourse is to leave the file at 16 bits but zero out the lower bits. Which is just what I did.

> because when I bring the volume back up... the sample can use those bits again!! they are STILL THERE!!! <

>Yes, they are still there, but they're all zeros! To take this to the extreme: Suppose I drop the volume by whatever it takes so only one bit is active, and then raise it back up. What do you think it will sound like? Do you see the point now?
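
A minimal sketch of the point being argued, assuming 16-bit signed samples held as plain Python integers and treating an 18 dB drop as an exact divide-by-8 (3 bits at roughly 6 dB each). The function names are hypothetical, and the gain change is assumed to be stored back as an integer rather than kept in a higher-precision mix bus:

```python
def zero_low_bits(sample, bits_to_drop):
    """Keep the sample in its 16-bit container but force the lowest bits to zero."""
    mask = ~((1 << bits_to_drop) - 1)
    return sample & mask


def drop_then_restore(sample, bits_to_drop):
    """Lower the level by about 6 dB per bit (integer result), then raise it back."""
    lowered = sample >> bits_to_drop   # the low bits fall off the end here
    return lowered << bits_to_drop     # the bit positions return, but filled with zeros


samples = [12345, -12345, 7, -1, 32767, -32768]
for s in samples:
    assert zero_low_bits(s, 3) == drop_then_restore(s, 3)

print([zero_low_bits(s, 3) for s in samples])
# [12344, -12352, 0, -8, 32760, -32768]
```

Under these assumptions the two operations give identical samples, which is the sense in which the bits are "still there" but carry no information.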


--Ethan

guhlenn

When you are digitally recording multiple tracks of audio, you will have more accuracy in the mixing process with 24 bits. In the end you mix down to 16 bits in stereo for the final cut but prior to that you benefit with 24 bits.
 
When you are digitally recording multiple tracks of audio, you will have more accuracy in the mixing process with 24 bits. In the end you mix down to 16 bits in stereo for the final cut but prior to that you benefit with 24 bits.

Most audio software mixes internally at 32bit. If you use 16bit audio files, you still benefit from this.
 
I still don't have time to do a proper writeup; however, the appropriate way to analyze this data is to organize the scores into a ranking system. The raw data can be ranked as follows. Listener 11 submitted two scores, so I numbered his data 11a and 11b. The blank lines are for people who either didn't rank the files or who submitted ambiguous rankings (there is one of the latter type). The best file is given a rank of 5 and the worst is given a rank of 1. In case of ties, the ranks are averaged.
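
For anyone who wants to see how the tie handling works, here is a small sketch. The input scores are hypothetical (the raw listener scores are not reproduced in this post) and the function name is made up for illustration:

```python
def rank_with_ties(scores):
    """Rank scores so the best gets the highest rank; tied scores share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next score is tied with the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        average_rank = (pos + 1 + end + 1) / 2   # ranks are 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1
    return ranks


# Hypothetical preference scores for File1..File5 (higher = better).
print(rank_with_ties([7, 9, 7, 4, 2]))   # [3.5, 5.0, 3.5, 2.0, 1.0]
print(rank_with_ties([5, 5, 5, 5, 5]))   # [3.0, 3.0, 3.0, 3.0, 3.0]
```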

Code:
	File1	File2	File3	File4	File5
1	3.5	5.0	3.5	2.0	1.0
2	3.5	5.0	3.5	2.0	1.0
3	5.0	4.0	3.0	2.0	1.0
4	2.0	4.0	4.0	4.0	1.0
5	4.0	4.0	4.0	2.0	1.0
6	3.0	5.0	4.0	2.0	1.0
7	3.0	5.0	4.0	2.0	1.0
8	3.0	2.0	1.0	4.0	5.0
9	4.0	3.0	1.0	2.0	5.0
10	2.0	3.0	5.0	4.0	1.0
11a	2.0	5.0	4.0	3.0	1.0
11b	3.0	4.0	5.0	2.0	1.0
12	4.0	5.0	3.0	2.0	1.0
13	4.0	4.0	1.5	4.0	1.5
14					
15	5.0	4.0	2.0	3.0	1.0
16	5.0	3.0	2.0	4.0	1.0
17	2.5	2.5	2.5	5.0	2.5
18	3.0	3.0	3.0	3.0	3.0
19					
20					
21	3.0	5.0	2.0	1.0	4.0
22	1.0	3.5	3.5	3.5	3.5
23					
24	3.0	4.0	5.0	2.0	1.0
25	4.0	4.0	4.0	2.0	1.0

After eliminating the blank scores and removing the listener column, one gets the following, which can be plugged directly into my web calculator at http://ff123.net/friedman/stats.html

Code:
File1	File2	File3	File4	File5
3.5	5.0	3.5	2.0	1.0
3.5	5.0	3.5	2.0	1.0
5.0	4.0	3.0	2.0	1.0
2.0	4.0	4.0	4.0	1.0
4.0	4.0	4.0	2.0	1.0
3.0	5.0	4.0	2.0	1.0
3.0	5.0	4.0	2.0	1.0
3.0	2.0	1.0	4.0	5.0
4.0	3.0	1.0	2.0	5.0
2.0	3.0	5.0	4.0	1.0
2.0	5.0	4.0	3.0	1.0
3.0	4.0	5.0	2.0	1.0
4.0	5.0	3.0	2.0	1.0
4.0	4.0	1.5	4.0	1.5
5.0	4.0	2.0	3.0	1.0
5.0	3.0	2.0	4.0	1.0
2.5	2.5	2.5	5.0	2.5
3.0	3.0	3.0	3.0	3.0
3.0	5.0	2.0	1.0	4.0
1.0	3.5	3.5	3.5	3.5
3.0	4.0	5.0	2.0	1.0
4.0	4.0	4.0	2.0	1.0

The results are as follows:

Code:
FRIEDMAN version 1.24 (Jan 17, 2002) http://ff123.net/
Friedman Analysis

Number of listeners: 22
Critical significance:  0.05
Significance of data: 3.23E-05 (highly significant)
Fisher's protected LSD for rank sums:  20.556

Ranksums:

File2    File1    File3    File4    File5    
 87.00    72.50    70.50    60.50    39.50   

---------------------------- p-value Matrix ------------

         File1    File3    File4    File5    
File2    0.167    0.116    0.012*   0.000*   
File1             0.849    0.253    0.002*   
File3                      0.340    0.003*   
File4                               0.045*   
--------------------------------------------------------

File2 is better than File4, File5
File1 is better than File5
File3 is better than File5
File4 is better than File5
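
The rank sums above can be checked by hand from the 22 rows of ranks. This is only a sketch of that check, not the FRIEDMAN program itself: it sums each column and flags any pair whose rank sums differ by more than the LSD value the program reported. The variable names are my own.

```python
FILES = ["File1", "File2", "File3", "File4", "File5"]

# The 22 rows of ranks listed earlier (best = 5, worst = 1).
RANKS = [
    [3.5, 5.0, 3.5, 2.0, 1.0], [3.5, 5.0, 3.5, 2.0, 1.0],
    [5.0, 4.0, 3.0, 2.0, 1.0], [2.0, 4.0, 4.0, 4.0, 1.0],
    [4.0, 4.0, 4.0, 2.0, 1.0], [3.0, 5.0, 4.0, 2.0, 1.0],
    [3.0, 5.0, 4.0, 2.0, 1.0], [3.0, 2.0, 1.0, 4.0, 5.0],
    [4.0, 3.0, 1.0, 2.0, 5.0], [2.0, 3.0, 5.0, 4.0, 1.0],
    [2.0, 5.0, 4.0, 3.0, 1.0], [3.0, 4.0, 5.0, 2.0, 1.0],
    [4.0, 5.0, 3.0, 2.0, 1.0], [4.0, 4.0, 1.5, 4.0, 1.5],
    [5.0, 4.0, 2.0, 3.0, 1.0], [5.0, 3.0, 2.0, 4.0, 1.0],
    [2.5, 2.5, 2.5, 5.0, 2.5], [3.0, 3.0, 3.0, 3.0, 3.0],
    [3.0, 5.0, 2.0, 1.0, 4.0], [1.0, 3.5, 3.5, 3.5, 3.5],
    [3.0, 4.0, 5.0, 2.0, 1.0], [4.0, 4.0, 4.0, 2.0, 1.0],
]

REPORTED_LSD = 20.556   # copied from the program output, not recomputed here

# Sum each column to get the rank sum per file.
rank_sums = {name: sum(row[i] for row in RANKS) for i, name in enumerate(FILES)}
ordered = sorted(rank_sums.items(), key=lambda kv: kv[1], reverse=True)

# A pair is flagged when the gap between rank sums exceeds the LSD.
significant_pairs = [
    (better, worse)
    for idx, (better, high) in enumerate(ordered)
    for worse, low in ordered[idx + 1:]
    if high - low > REPORTED_LSD
]

for name, total in ordered:
    print(f"{name}: {total:.2f}")
for better, worse in significant_pairs:
    print(f"{better} is better than {worse}")
```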

If only the 10 listeners who correctly identified files 4 and 5 are included in the analysis, the results are as follows:

Code:
FRIEDMAN version 1.24 (Jan 17, 2002) http://ff123.net/
Friedman Analysis

Number of listeners: 10
Critical significance:  0.05
Significance of data: 3.75E-07 (highly significant)
Fisher's protected LSD for rank sums:  13.859

Ranksums:

File2    File3    File1    File4    File5    
 45.00    39.00    36.00    20.00    10.00   

---------------------------- p-value Matrix ------------

         File3    File1    File4    File5    
File2    0.396    0.203    0.000*   0.000*   
File3             0.671    0.007*   0.000*   
File1                      0.024*   0.000*   
File4                               0.157    
--------------------------------------------------------

File2 is better than File4, File5
File3 is better than File4, File5
File1 is better than File4, File5

ff123
 
Conclusions:

Even the group of 10 people who correctly identified files 4 and 5 could not distinguish between files 1, 2, and 3.

An interesting point about the analysis for the 10 "sensitive" listeners: It cannot be said that file4 is significantly better than file5. This isn't quite intuitive to me, but I'm very sure that the analysis correctly implements a Friedman analysis for ranked data and a Fisher's LSD (also for ranked data) to separate the ranksums.

ff123
 
Ethan Winer said:
Slack,

> Let's say that your realistic noise floor is at -70db. That means you can damn near toss off 4 of your least significant bits unless the noise you're recording is desirable. <

Yes, though when there are fewer bits the distortion rises too. An 8-bit Wave file is not only noisier than a 16-bit equivalent, it's also grittier sounding.

> You know what I thought the big clincher would be? The fact that you've got 8 MILLION voltage levels to choose from between 0 and -6db in a 24 bit system. <

But that doesn't really matter because even with 16 bits the distortion is negligible. Further, consider that those tiny steps in voltage are sent to a loudspeaker, where they are translated to tiny steps in the speaker cone's position. If a tweeter has a maximum throw of 1/4 inch (and most are probably much less than that), you're talking about micron sized displacements! I don't know how many loudspeakers can be positioned that accurately.

> After a certain amount of resolution, it's probably the clock and the accuracy of the sampler that has the biggest impact on sound quality. <

That too is not the problem so many people seem to think it is. If a clock source has discrepancies that occur in the MHz range, then the only effect is MHz components, which are filtered out in the D/A conversion but would be inaudible even if they weren't.

> The difference between the two mixdowns was extremely minimal. In fact I used wavelab to create a "difference" file, and the only differences were WAY down like -80db. <

My main point exactly.

--Ethan

The accuracy of the sampler is very important. If the sampler has too much jitter, the samples that are taken could be way off, especially where the waveform is steepest, such as near the zero crossing of a sine wave. Voice and music waveforms are much more complex and have many steep changes in voltage, so sampling accuracy is a key concern.
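
To put rough numbers on this, here is a back-of-the-envelope sketch. It assumes a full-scale sine wave and a single worst-case timing error at the zero crossing, where the slope is 2*pi*f; the jitter values are hypothetical and not taken from any particular converter, and the function name is made up:

```python
import math


def worst_case_jitter_error_db(freq_hz, jitter_s):
    """Worst-case sample error in dB relative to full scale for a full-scale sine.

    A sine of amplitude 1 has its maximum slope, 2*pi*f, at the zero crossing,
    so a timing error of jitter_s seconds moves the sample by about 2*pi*f*jitter_s.
    """
    error = 2 * math.pi * freq_hz * jitter_s
    return 20 * math.log10(error)


# One 16-bit step relative to peak amplitude, about -90.3 dB.
LSB_16_BIT_DB = 20 * math.log10(1 / 32768)

for jitter in (1e-9, 100e-12, 10e-12):   # hypothetical jitter values
    err_db = worst_case_jitter_error_db(20_000, jitter)
    print(f"{jitter * 1e12:6.0f} ps at 20 kHz -> {err_db:7.1f} dB "
          f"(one 16-bit step is {LSB_16_BIT_DB:.1f} dB)")
```

This is a small-angle estimate of a single displaced sample, not a full analysis of how jitter spreads across the spectrum, so treat it as an order-of-magnitude guide only.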
 
wes480 said:
I'm confused - i don't really understand binary at all...but..for example...all of these 'volume' examples...

take

1011 0001 1000 0001

and "turn it down 20db" (this isn't the real numbers..i have no clue what I am doing)

1010 0001 1000 0000

someone said something like that earlier...slack i believe.

So, I am lowering my volume...hmm...well, if all of that crap is volume - then how does digital (binary)make a violin sound like the violin? Or drums sound like drums....seems like all of the changes in 1 and 0 would have to be a lot more precise on the SOUND itself than on the volume.

Someone, explain..explain :)

The number of bits has to do only with the amplitude, or loudness, of the sound. With more bits you have more accuracy, because a smaller change in voltage results in a one-bit change.

The sampling rate and the hardware/software filtering and equalization are related to the frequency components that distinguish the different instruments you refer to. When the sampling rate is at least double the highest frequency in your music, you can reproduce all the sounds on playback. Sampling rates from 44.1 kHz on up are fine for 20 Hz to 20 kHz.
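
A small sketch of the "at least double" rule, assuming pure tones and no anti-alias filter in front of the sampler (real converters do filter). The function names are made up for illustration:

```python
def nyquist_limit(sample_rate_hz):
    """Highest frequency a given sampling rate can represent."""
    return sample_rate_hz / 2


def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency at which a pure tone shows up after sampling."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


print(nyquist_limit(44_100))             # 22050.0
print(alias_frequency(20_000, 44_100))   # 20000 (below the limit, unchanged)
print(alias_frequency(30_000, 44_100))   # 14100 (above the limit, folds back down)
```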

It is the frequencies and other related characteristics coming from the violin that distinguish it from other instruments. The main point here is that it is NOT the amplitude that makes instruments sound different. If you make a guitar louder, it still sounds like a guitar.
 
hold it

Yeah, can we talk about that please? I realize this is more suitable for a new thread, but you just touched on a subject that's bothering me. Exactly how important is this slight jitter between samples when recording music (not syncing, or doing anything else where cumulative time delays can amount to something meaningful by the time a song is over, just plain recording)? We're taking say 44,100 samples in one second, and you mean to say that some tiny amount of jitter in those samples makes a damned bit of difference to the human ear? I wish someone could convince me.

I think this very informative little experiment with bit depth should be expanded to force everyone to take a good hard look (listen) at differences in soundcards with "good" and "bad" timing stability, as well as contrast different sampling rates. I wish I could conduct such an experiment but I'm not set up for it. Now, back to this bogus jitter BS: when I look at a time domain representation of a fairly "ratty", jagged portion of my band's recordings sampled at 44.1 kHz, those steep slopes and zero crossings end up looking pretty smooth, and they're tightly packed with samples. The image below is one example, which I hope shows up. It should be obvious to anyone that any slight shifts in any number of those digital samples wouldn't mean a thing to the listener. Maybe if this millisecond or so of music were pushing the Nyquist limit out in a dog's range at say 20.05 kHz then a slightly displaced sample might imperceptibly change the shape of the signal, but can any being (human or dog) register that in his brain? I'm not trying to be a wise guy; I'd really like to know whether all this fuss about timing stability is just as far past the point of diminishing returns as the 16 to 24 bit issue.

Thanks for any insight
 

Attachment: samples.webp (15.7 KB)
Ethan Winer said:
Wes,

> if all of that crap is volume - then how does digital (binary)make a violin sound like the violin? Or drums sound like drums. <

Those numbers represent the voltages generated by the microphone. Which represent the changes in air pressure when the musical instrument or voice was recorded.

When a microphone is connected to an amplifier which is in turn connected to a loudspeaker, the loudspeaker reproduces the same air movements the microphone captured originally. In an analog system, the voltage from the microphone is amplified and sent to the loudspeakers. In a digital system, the voltage from the microphone is measured 44.1 thousand times per second, and converted to equivalent numbers. In a 16-bit system these numbers can range from -32768 to +32767, which can represent a lot of tiny differences in the displacement of the loudspeaker cone.
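
Here is a minimal sketch of that measuring step. It assumes the microphone voltage has already been scaled to the range -1.0 to +1.0 and uses the usual two's-complement 16-bit range of -32768 to +32767; the function names and the 1 kHz test tone are purely illustrative:

```python
import math

SAMPLE_RATE = 44_100
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16(voltage):
    """Map a voltage in the range -1.0..+1.0 to the nearest 16-bit integer."""
    value = round(voltage * 32767)
    return max(INT16_MIN, min(INT16_MAX, value))   # clip anything out of range


def sample_sine(freq_hz, n_samples, amplitude=1.0):
    """Measure a sine wave SAMPLE_RATE times per second and store it as numbers."""
    return [
        to_int16(amplitude * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE))
        for n in range(n_samples)
    ]


# The first few numbers that would be stored for a full-scale 1 kHz tone.
print(sample_sine(1_000, 8))
```

Whether the source is a violin or a drum makes no difference to this step; the list of numbers simply follows whatever the voltage does.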

Does this help?

--Ethan

To correctly understand why a violin sounds different than drums, you will need to understand Shannon's Sampling Theorem, which falls under digital signal processing (DSP), an advanced engineering subject. Your elementary description of air movements at the microphone and speaker is true but fails to answer the question. This is probably because there seem to be few analogies available to explain this to someone without an engineering background.

So your question is obviously not related to the volume and not related to the bit resolution. It is related to the frequency content of the instrument, which is related to the sampling rate and any filtering of the signal. As a user of recording equipment, you really don't need to understand how the sampling process works in order to get good recordings. If you do need to understand, take a DSP class at a university.
 
BasPer said:


This is not true. There is absolutely nothing preventing you from transferring information below the noise level. On the contrary, the noise helps to increase the resolution. If there was no noise it would be impossible to detect differences of less than 1/2 LSB; but with noise the voltage resolution can be better than the resolution of the A/D converter.
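
A toy simulation of this claim, under simplifying assumptions: a constant input of 0.3 LSB, uniform noise of plus or minus half an LSB, and plain averaging of many readings. The function names and values are made up for illustration:

```python
import math
import random


def quantize(x):
    """Round to the nearest whole step (1 step = 1 LSB)."""
    return math.floor(x + 0.5)


def average_reading(true_value, n_samples, noise_amplitude, seed=0):
    """Average many quantized readings of a constant input with added uniform noise."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_samples):
        noise = rng.uniform(-noise_amplitude, noise_amplitude)
        total += quantize(true_value + noise)
    return total / n_samples


signal = 0.3   # a level smaller than half a step

print(average_reading(signal, 100_000, 0.0))   # without noise every reading is 0
print(average_reading(signal, 100_000, 0.5))   # with noise the average lands near 0.3
```

Without noise the converter reports 0 every time and the 0.3 is lost; with noise it reports 1 about 30% of the time, so the average recovers the level.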

Then why don't manufacturers purposely generate noise in their products in order to capitalize on your wonderful idea?
 
ff123 said:
Conclusions:
An interesting point about the analysis for the 10 "sensitive" listeners: It cannot be said that file4 is significantly better than file5.

That's annoying!
All of them ranked File4 2.0 and File5 1.0. Every listener agrees; what more does your program want?
 
Steve Walker said:

Please note that a 6 db drop is 1/2 of the voltage into the 24 bit analog-to-digital converter (ADC).

db = 20 log (V1/V2)

This means that 1/2 of the 24 bits go to zero and therefore a 6db drop affects 12 bits not 1 bit.

Good joke! And when you cut 44 in half, you get 4 :)

24 is the number of digits, not the number of levels.
 
That's annoying!
All of them ranked File4 2.0 and File5 1.0. Every listener agrees; what more does your program want?

Looks like it wants about 10 more listeners like that! One disadvantage of a ranked test like this one is that the resulting analysis is not as powerful as it could be otherwise, and it shows for small sample sizes.

It would have been better to ask that listeners rate each sample on a 1.0 to 5.0 scale rather than to rank them in order of preference. That way, I could have used a more powerful analysis.

BTW, the formula for calculating whether one setting is deemed different (at 0.05 significance) from another is rather simple:

LSD = 1.96 * sqrt(b*t*(t+1)/6)

where b is the number of listeners = 10
and t is the number of settings being evaluated = 5

If LSD is less than the difference between the sum of any two columns, then the two columns are significantly different from each other.

In the case of the 10 listeners, the sum of the File5 column is 10 and the sum of the File4 column is 20, so the difference is 10. LSD is 13.9, which is greater than this difference, so there is no significant difference. The difference between the columns doesn't exceed the LSD until the number of listeners is 20. Another way to get a higher difference is to have people rate File4 higher than just #4, which is apparently what happened for the bigger group of 22.
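
The same arithmetic as a short sketch, using the formula above with 1.96 as the critical value. The helper names are made up, and `listeners_needed` assumes every listener puts the two files exactly one rank apart:

```python
import math


def rank_sum_lsd(listeners, settings, z=1.96):
    """LSD = z * sqrt(b*t*(t+1)/6) for b listeners and t settings."""
    return z * math.sqrt(listeners * settings * (settings + 1) / 6)


def listeners_needed(settings, diff_per_listener=1.0):
    """Smallest group size at which a unanimous one-rank gap exceeds the LSD."""
    b = 1
    while b * diff_per_listener <= rank_sum_lsd(b, settings):
        b += 1
    return b


print(round(rank_sum_lsd(10, 5), 1))   # 13.9
print(round(rank_sum_lsd(22, 5), 1))   # 20.6
print(listeners_needed(5))             # 20
```

The rank-sum gap grows in proportion to the number of listeners while the LSD grows only with its square root, which is why adding listeners eventually tips the result.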

ff123
 
One more point: if File4 and File5 were compared directly against each other, with no other files in the test, then only 6 people in agreement would be required (using the LSD formula) to say the two files are significantly different. The moral? Fewer files to compare at the same time means more powerful results.

ff123

Edit: before somebody asks, I already tried eliminating File4 and File5 from the analysis of the 10 sensitive listeners. It does show then that File2 is preferred over File1 with a confidence > 95%. Problem is, I'm not real comfortable with doing this. The proper method in statistics is to conduct another experiment to confirm results like this.
 
Steve,

> Please note that a 6 db drop is 1/2 of the voltage into the 24 bit analog-to-digital converter (ADC). ... This means that 1/2 of the 24 bits go to zero and therefore a 6db drop affects 12 bits not 1 bit. <

Each bit accounts for 6 dB. Reducing the volume by 6 dB. is the same as shifting all of the bits to the right one position. Only the single right-most bit is dropped.
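
A quick sketch of that relationship. The 16-bit pattern is just an arbitrary example treated as an unsigned number, and the helper name is made up:

```python
import math


def db_change(ratio):
    """Level change in dB for a given voltage ratio: 20 * log10(V1 / V2)."""
    return 20 * math.log10(ratio)


sample = 0b1011_0001_1000_0001   # arbitrary 16-bit value (45441)
halved = sample >> 1             # shift every bit one place to the right

print(f"{sample:016b} -> {halved:016b}")   # only the right-most bit is lost
print(round(db_change(0.5), 2))            # -6.02, half the voltage
print(round(db_change(2 ** -24), 1))       # -144.5, the span of all 24 bits
print(2 ** 24, 2 ** 23)                    # levels in 24 bits, and levels in the top 6 dB
```

Halving the voltage halves the number of levels in use (from about 16.8 million to about 8.4 million at 24 bits), which costs one bit, not half of the bits.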

--Ethan
 
Steve,

> The accuracy of the sampler is very important. <

No doubt.

> If the sampler has too much jitter <

How much is too much? 0.001%? 0.0001%? Is jitter really an issue with modern digital gear? I can't see how. Especially compared to all the cumulative distortions from room acoustics, speaker distortion, and so forth.

--Ethan
 
Steve,

> To correctly understand why a violin sounds different than drums you will need to understand Shannon's Sampling Theorem <

I am not a math-head, but I assure you I know exactly why a violin sounds different from a drum. :)

> Your elementary description about air movements on the microphone and speaker are true but failed to answer the question. <

In fact, that "elementary description" represents all that is important for the subject at hand, and answers the question completely. Sound emanates from a source, and the task at hand is to capture that sound and eventually have a loudspeaker vibrate in the same way so as to reproduce the original sound. It doesn't matter whether the original sound is a drum or a violin, and the original question didn't ask why a drum and violin sound different.

What was asked is how audio can be represented as a bunch of numbers. Whether the audio is encoded as a series of numbers that represent the original changes in air pressure, or as different degrees of magnetization on oxide for the same purpose, the point is to capture the changes in air pressure, and store them for reproduction later.

--Ethan
 
I just spent 15 min writing something only for it to get erased so I'll keep it short.

Converters come in different sizes and different shapes.

It has very little to do with 24 versus 16 and much more to do with the quality of the converter.

Every manufacturer places his converter on a specific circuit using specific software. All this affects the end result.
Place the same converter on a different card using different software and you have a different result......(Go make a point of 24 versus 16.........)
I have heard a 20 bit card kick a 24 bit card's ass more than once.

Does the converter output info to 8 outputs (another good joke by home recording soundcard manufacturers)?
To 4 outputs? (is it still linear?!?!?) or to single outputs!
Can it handle the pushing of info to 8 outputs or does it sound "Stuffed" or worse sputters and krechs......

Work with a cheap converter in 24 bits and then move to a good converter after a few months of work and "YOU TOO CAN HEAR THE DIFFERENCE...JUST DIAL 1-800-24-24-WHOOHEEE"

I'm not sure your short clips can actually prove your point.
Even more so using a Delta66 (with all due respect), although your question is extremely legitimate and worth raising.

And.....your cat is way too fat......
 
I still don't see why this thread is titled 24 bit challenge. There was no 24bit file represented in the test. The highest was 16bit. Again, the test was only to determine whether you can tell the difference between dither and truncation. However most listeners couldn't hear the difference between 12 and 16bit either (on this selection). But once again - no 24bit example here.
 