The 24-bit challenge

Do yourself a favor, get off the internet and go to the gym and hang out with some men. And watch out for the guys who bring a towel to wipe down the equipment before they'll touch it.
And be careful when you bend over - unless you like that sort of thing.

windowman, insulting people on a message board does not necessarily make you a man. Neither does going to the gym.



Anyway, one of the benefits of 24 bit is that any processing you do on the signal starts with a higher resolution sound. This should lead to better sounding effects. It would be a good test to put the 16 bit file through a sequence of effects, and compare it to the 24 bit file put through the same sequence. While the straight up sound difference between a 16 bit and a 24 bit source may be very subtle, the difference in a processed signal should be more obvious.
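Here is a rough way to try that idea with numbers instead of ears. It is only a sketch under assumptions that are not from this thread: the "effects" are a made-up list of plain gain changes (GAIN_STEPS_DB) that add up to 0 dB, the result is rounded back to the working bit depth after every step, and the test signal is a 440 Hz sine at half of full scale.

```python
import math

def quantize(x, bits):
    """Round x (nominally -1..1) to the nearest step of a signed `bits`-bit grid."""
    scale = 2 ** (bits - 1)
    return round(x * scale) / scale

# Hypothetical effect chain: gain changes in dB that add up to 0 dB overall.
GAIN_STEPS_DB = [-12, 5, -7, 9, -3, 8]

def process(samples, bits):
    """Run the chain, re-quantizing to `bits` after every gain step."""
    out = [quantize(s, bits) for s in samples]
    for step_db in GAIN_STEPS_DB:
        gain = 10 ** (step_db / 20)
        out = [quantize(s * gain, bits) for s in out]
    return out

def rms_error_db(reference, processed):
    """RMS difference between two signals, in dB relative to full scale."""
    err = sum((r - p) ** 2 for r, p in zip(reference, processed)) / len(reference)
    return 10 * math.log10(err) if err > 0 else float("-inf")

# A tenth of a second of a 440 Hz sine at half of full scale, 44.1 kHz.
signal = [0.5 * math.sin(2 * math.pi * 440 * n / 44100) for n in range(4410)]

error_16 = rms_error_db(signal, process(signal, 16))
error_24 = rms_error_db(signal, process(signal, 24))
print(f"16-bit chain error: {error_16:.1f} dB")
print(f"24-bit chain error: {error_24:.1f} dB")
```

The error left behind by the 24 bit chain should come out far lower than the 16 bit one. Whether that difference is audible is a separate question that only a listening test can answer.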

Another good test would be to mix a 16 track song starting with both 16 and 24bit source files - comparing the difference. I have often wondered if the benefits of higher bit depth in the source file are lost when mixing several files together. I guess if you can prove that each track sounds better after the processing of its track specific inserts, then it's going to sound better going into the mix.

I also agree that putting the files in MP3 format completely invalidates the tests. The difference in sound quality between wav and mp3 is uncontested, and is likely to mask a more subtle difference in sound quality.

For instance, if you add a hiss at -60 dB, it won't make a difference whether the original signal has a 90 dB or 148 dB dynamic range.
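To put a number on that, here is a small sketch that adds two noise sources as powers, assuming they are uncorrelated. The function name combined_noise_db is just for illustration; the -60, -90 and -148 figures are the ones from the sentence above.

```python
import math

def combined_noise_db(*levels_db):
    """Sum uncorrelated noise sources given in dB (power sum), return the total in dB."""
    total_power = sum(10 ** (level / 10) for level in levels_db)
    return 10 * math.log10(total_power)

hiss_db = -60.0
for floor_db in (-90.0, -148.0):
    total = combined_noise_db(hiss_db, floor_db)
    print(f"hiss {hiss_db} dB + floor {floor_db} dB -> total {total:.3f} dB")
```

The two totals differ by less than 0.01 dB, so the louder hiss completely dominates either noise floor.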

As annoying as it may be, better to post wav files.
 
mcr wrote:

I got three right and two wrong ... which is best explained by luck. Statistically, significance is usually set at 5% or 10% levels ... hence, in order to show that your decisions are more than mere guesses you should be able to have all FIVE right. Getting four right out of a sample of five is not enough and would not be considered as proof by any scientist.

Actually, if there is just one file which is easy to hear problems with, but the others sound the same, then it would be appropriate to get only one correct identification out of five.

For an individual listener, the best way to show that your choices are not merely the result of chance is to perform ABX (double blind) tests (http://www.pcabx.com) against a reference file. However, Ethan's test is a little problematic because a reference file is not identified.

Given that, the best listening strategy would be to perform an initial listening pass on all the files (played all the way through, of course) to figure out if you can hear a problem with any of the files. From the remaining files (which should be audibly indistinguishable from each other), choose a reference file, which you can then use to perform ABX testing.

For a group of listeners, the best way to perform a statistical analysis is with some sort of multiple testing method, like the one I wrote and mentioned in my earlier post. BTW, another problem with Ethan's test is that most people will listen to the files in the same order (i.e., 1 through 5). It would have been better to randomize the listening order.

The biggest problem with Ethan's test is that many people won't know what to listen for. It would have been nice to include an even lower bit resolution (!) so that the problem is very obvious to start with, but progressively gets more subtle. That way, people can learn to hear what bit truncation sounds like while they take the test.

Still, I know that some listeners will be able to hear the effects of 9 and 11 bits. It will be interesting to see if the group of people who took the test can statistically distinguish the 13 bit file (I have no illusions that the 16 bit files can be distinguished from each other).

An interesting bit of probability: if you are able to identify the 9 bit and the 11 bit files, and then just guess at the remaining three, you have a 2 in 3 chance of getting at least 3 correct overall, and a 1 in 6 chance of getting them all right!
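Those odds can be checked by brute force. The sketch below assumes each listener gives every remaining file a different label, so a random guess is just a random ordering of the leftover labels. The function name guess_distribution is made up for this example.

```python
from fractions import Fraction
from itertools import permutations

def guess_distribution(n_unknown):
    """Chance of each number of correct matches when n_unknown labels are
    assigned to n_unknown files completely at random (no repeats)."""
    truth = tuple(range(n_unknown))
    counts = {}
    total = 0
    for guess in permutations(truth):
        hits = sum(g == t for g, t in zip(guess, truth))
        counts[hits] = counts.get(hits, 0) + 1
        total += 1
    return {hits: Fraction(c, total) for hits, c in sorted(counts.items())}

dist = guess_distribution(3)   # 9 and 11 bit already identified, 3 files left
# Two files are known, so "at least 3 correct overall" means at least 1 lucky guess.
p_at_least_three = sum(p for hits, p in dist.items() if hits >= 1)
p_all_five = dist[3]
print(p_at_least_three, p_all_five)   # 2/3 1/6
```

Calling guess_distribution(5) gives the same breakdown for a listener who cannot identify any file and guesses all five.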

ff123
 
"does not nescessarily make you a man. Neither does going to the gymn."

Ah come on, have a heart, at least it'll make me "more" of a man if there's more muscle on me won't it? More of me to love, more of me to fight or maybe just plain more of me. Besides, I promise you'll get more dates with 20 inch biceps. That oughta be worth a trip to the gym an hour every day don't ya think? Beats hanging out on the internet anyway, although I try to visit that strange world at least half an hour per day or however long it takes me to get a rise outta somebody. Kinda like you. :D

I wouldn't be too quick to call the differences between a wave and a 128k MP3 uncontested BTW. I did a similar test like this involving MP3s a couple of months ago that fooled nearly everybody. People all think they have "special" ears though so what's a guy gonna do? I'm a born skeptic though so I can't resist testing people's claims. It'd make a good article for me to write sometime for Skeptic Mag except they're more into ghosts than music. (sigh)
~~~~~~~~~~~~~~~

My other hobby--literary analysis.
The Dark Tower
 
I think you're missing something really important here windowman. Nowadays females tend to go for those adorable little weaklings...you know Ricky Martin, Justin Timberlake, etc. Of course I always refer to those kind as a little 3 letter word that starts with g and rhymes with day :D
 
There seems to be a difference.

I haven't listened to the test yet. But I did compare a 16 bit SR-16
drum machine with an 18 bit DM5. And the 18 bit samples sounded
about twice as "clean", with more luster and brilliance. But I doubt
the difference is worth losing sleep over. If you have a choice maybe
go for the higher bits. But if you're stuck with 16, don't whine about
it, and don't forget that making good music is the reason we got into
home recording in the first place. Not gear collecting.

I'd also like to utter praise to the Windowman for such a ballsy
monologue. He speaks the truth.
 
I'd also like to utter praise to the Windowman for such a ballsy

I agree it is worthwhile to challenge generally accepted theories - like 24 bit being better than 16 bit. It's stupid to go around calling people nerds and sissies just because they don't agree with you - or even if they like to maintain a high level of hygiene at the gym.


Look - I'm all for being in good shape. But does it help with women? I don't think so - not too much anyway. Money and popularity on the other hand ... As far as Ricky Martin - I'm sure he has a private gym and a personal trainer - a makeup man and coiffeur also.
 
Ethan: when are we going to see some results? I expect a random uniform distribution across the choices, but I will be delighted if the results show otherwise, and it might just put a long debated issue in perspective.
There should be enough responses by now to give a statistically meaningful update. After all, you owe it to everyone who took the time to download those files and participate in the test. (a little arm twisting)
 
i picked correctly for the 16 bit dithered file...the rest i was wrong....so dithering does make a diff...to me anyway :D
 
The results are in. Here is how I judged (I won't release actual answers publicly until Ethan does):

I started by listening to all the files in order to see whether I could pick out the two worst files, which turned out to be not too hard.

X was obviously the worst, with noisy grunge apparent especially at the end. There was no need to perform an ABX comparison (http://pcabx.com/) because the difference from the others was very obvious.

Y was next worst, but much better than X. There is slightly more hiss than the others at the end. In an ABX comparison against Z, I scored 8 out of 8 correct (probability of getting those results by chance: 0.004)

The next three are all bunched together, and I don't think I can tell any of them apart. On quiet music, I can usually hear the hiss associated with 13 bits, but this sample has a lot of signal, so I don't.

Equipment: M-Audio Audiophile 2496 card, Grado SR325 headphones plugged in directly to the card

I would like to see a scoring matrix like the one I described in my first post, so that group statistics can be run.

ff123
 
Teach,

> i picked correctly for the 16 bit dithered file...the rest i was wrong....so dithering does make a diff...to me anyway :D <

You're kidding, right? If the rest were all wrong, a much better conclusion is that you heard no real difference. One thing to try - but I won't because the files are so large - is to have everybody do it again just to see if they get the same answers a second time. That would be more meaningful.

--Ethan
 
Ethan wrote:

You're kidding, right? If the rest were all wrong, a much better conclusion is that you heard no real difference. One thing to try - but I won't because the files are so large - is to have everybody do it again just to see if they get the same answers a second time. That would be more meaningful.

That would be a bit more meaningful, but not by much. You'd have to repeat the test about 5 times, and Teach would have to get the same answer 5 times out of 5 before he could say with better than 95% confidence that he heard a difference.

Far easier just for Teach to perform an ABX test pitting the two 16-bit files against each other.

BTW, I emailed you about getting the raw data (stripped of information identifying each listener) to perform an analysis of it.

ff123
 
ff123, you said a couple of things which I hope you will clarify. Earlier, after stating you picked out the worst two files you said, "I scored 8 out of 8 correct (probability of getting those results by chance: 0.004)".

And in the last post you said, "Teach would have to get the same answer 5 times out of 5 before he could say with better than 95% confidence".

What kind of pdf did you assume to compute these confidence bounds, and how did you arrive at 8 out of 8 (when there were only 5 files) and the corresponding probability of 0.004?

thanks
 
ff123, you said a couple of things which I hope you will clarify. Earlier, after stating you picked out the worst two files you said, "I scored 8 out of 8 correct (probability of getting those results by chance: 0.004)".

In an ABX comparison (http://www.pcabx.com/), one can listen to three files: A, B, or X. I assigned A to be the second-worst file (the 11 bit file) and B to be what I chose to be a reference file (it just happened to be the dithered 16 bit file). The ABX program randomly assigns X to be either the same as A or the same as B. Then the listener decides whether X sounds the same as A or the same as B. I chose X correctly 8 times out of 8 times that I tried it. The probability of this occurring by chance is 0.5^8, or 0.004. I have a program which can calculate this probability for various combinations of correct identifications and trials performed using Monte Carlo simulation at:

http://ff123.net/abx/abx.html
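For anyone who wants to check the arithmetic without a simulation, here is a minimal sketch that sums the binomial tail exactly. It is not the program linked above, just a stand-in, and the function name abx_p_value is made up for this example. It assumes every ABX trial is an independent 50/50 guess.

```python
from math import comb

def abx_p_value(correct, trials):
    """Probability of getting at least `correct` right out of `trials`
    ABX trials by pure guessing (each trial is a 50/50 coin flip)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

print(f"{abx_p_value(8, 8):.5f}")   # 0.00391
print(f"{abx_p_value(5, 5):.5f}")   # 0.03125
print(f"{abx_p_value(7, 8):.5f}")   # 0.03516
```

The first line reproduces the 0.5^8 figure. The last line shows that a single miss in 8 trials would still come in under the usual 5% level.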

And in the last post you said, "Teach would have to get the same answer 5 times out of 5 before he could say with better than 95% confidence".

Oops. I made a mistake and mixed my ABX probabilities with guessing from a group of 5 choices. If Teach was randomly guessing which one was the dithered 16 bit file, then if Ethan reshuffled the files, only two correct identifications out of two tries would be needed to yield better than 95% confidence: (0.2)^2 = 0.04. The 5 out of 5 is the score needed to get (0.5)^5 = 0.03 on an ABX test. My bad.

ff123

Edit: The above probability only holds if Teach was truly selecting randomly from a pool of 5 files. If he was able to eliminate one or two of them from the pool (as obviously not being the 16-bit files), then more correct identifications would be needed to bring the confidence back up to 95%. For example, imagine that Teach was able to eliminate the 9, 11, and 13 bit files and had just two files from which to guess which one was the 16-bit dithered file. Then two correct guesses yields only (0.5)^2 = 0.25. We're back to the situation where 5 correct guesses out of 5 are needed to get > 95% confidence that the results were not by chance.
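The same reasoning for any pool size, as a short sketch. It assumes every file left in the pool is equally likely to be picked on each reshuffled round and that the rounds are independent. The function name guesses_needed is made up for this example.

```python
def guesses_needed(pool_size, alpha=0.05):
    """Smallest number of consecutive correct picks from `pool_size` equally
    likely candidates whose chance probability (1/pool_size)**n falls below alpha."""
    n = 1
    while (1 / pool_size) ** n >= alpha:
        n += 1
    return n

for pool in (5, 4, 3, 2):
    n = guesses_needed(pool)
    print(f"pool of {pool}: {n} in a row, chance = {(1 / pool) ** n:.4f}")
```

A pool of 5 needs 2 in a row, pools of 4 or 3 need 3 in a row, and a pool of 2 needs 5 in a row, which matches the cases worked out above.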
 
ff123,

> You'd have to repeat the test about 5 times, and Teach would have to get the same answer 5 times out of 5 before he could say with better than 95% confidence that he heard a difference. <

There you go! :)

> BTW, I emailed you about getting the raw data (stripped of information identifying each listener) to perform an analysis of it. <

I almost offered that before you asked, after reading your desire to analyze the results. But I was too lazy. So I'm glad you asked, because I too would like to see a more comprehensive analysis.

Please post the result here and anywhere else when you're done.

--Ethan
 
I tried to post this a couple days ago when the board was down:


Sorry about complaining about the mp3 files. I just read some old posts and never bothered to check.

I took a listen. 2 are obvious (as others have noted), the other three sound pretty similar. At least I'm not going to hear much difference without a good pair of headphones or monitors - and I'm not at the studio. When comparing the 2 16 bit files, all you are comparing is the effect of dither. To hear the difference between 24 bit and 16 bit, you must compare 24 and 16. The effect of dither is very subtle, and most noticeable with very quiet sounds that are close to the noise floor. There's no reason why anyone should be able to hear it well in this recording. Check out the documentation for the Waves L1 for a good lesson in hearing dither and noise shaping.
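A toy illustration of why dither matters most near the noise floor. Everything in it is an assumption made for the demo, none of it comes from Ethan's files: the tone peaks at 0.4 of one quantizer step (LSB), the quantizer rounds to the nearest step, and the dither is plain triangular (TPDF) noise with no noise shaping.

```python
import math
import random

def quantize(x):
    """Round to the nearest whole LSB (values here are expressed in LSBs)."""
    return round(x)

def tpdf_dither(rng):
    """Triangular dither spanning +/-1 LSB: the sum of two uniform +/-0.5 LSB values."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)

rng = random.Random(0)            # fixed seed so the run is repeatable
n = 20000
# A tone whose peak is only 0.4 LSB, i.e. below the smallest quantizer step.
tone = [0.4 * math.sin(2 * math.pi * k / 100) for k in range(n)]

plain = [quantize(s) for s in tone]
dithered = [quantize(s + tpdf_dither(rng)) for s in tone]

def tone_gain(output):
    """How much of the original tone survives in `output` (1.0 = all of it)."""
    return sum(o * s for o, s in zip(output, tone)) / sum(s * s for s in tone)

print("without dither:", tone_gain(plain))
print("with dither:   ", round(tone_gain(dithered), 2))
```

Without dither every sample rounds to zero and the tone disappears. With dither the tone is still in there at close to its original level, buried in a steady hiss. On loud material like the test recording there is no signal that small on its own, which fits the point that the dither is hard to hear here.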


-------------------------------

It is scientific fact that the average human ear can hear and discriminate sound at levels more than 96 dB apart. This information is the result of precise laboratory testing. This indicates that the average person can discern the difference between a 16 bit and a 24 bit soundfile if the conditions are right. The conditions involve factors such as what kind of sound is encoded in the file, and what kind of equipment is being used to record and play the file. If the difference is masked by the ambient noise of a recording studio, that is another matter.
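For reference, the 96 dB figure is the ratio between full scale and one quantization step for a 16 bit word. A quick sketch of that calculation follows. It is the simple 20*log10(2^bits) rule of thumb only, and says nothing about what a real converter achieves.

```python
import math

def dynamic_range_db(bits):
    """Ratio between full scale and one quantization step, in dB: 20*log10(2**bits)."""
    return 20 * math.log10(2 ** bits)

for bits in (16, 24):
    print(f"{bits}-bit: {dynamic_range_db(bits):.1f} dB")
# 16-bit: 96.3 dB
# 24-bit: 144.5 dB
```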

What it comes down to is sound quality. The 24 bit converters on the market today sound, for whatever reason, better than the 16bit ones. So if you record to 24bit, and then reduce to 16bit - with either truncation or dither - it should still sound better than recording straight to 16 bit - especially as you will be benefiting from the lower noise floor (higher headroom) of 24 bit during the recording process.
 
Ethan stripped the emails of any identifying comments, such as names and email addresses (except for my email, which points back to my website). So I posted the raw comments here:

http://ff123.net/24bit/24bitcomments.html

I need more information from the person who wrote this comment:

I saw your petition for your "24-bit test" on ProRec. Here's my
ranking:

File1 - 3
File2 - 4
File3 - 5
File4 - 2
File5 - 1

I assume this is ranked #1 being best and #5 being worst. Please contact me if this is incorrect (miyaguch at eskimo dot com)

ff123
 
You're kidding, right? If the rest were all wrong, a much better conclusion is that you heard no real difference. One thing to try - but I won't because the files are so large - is to have everybody do it again just to see if they get the same answers a second time. That would be more meaningful.

i don't think so considering there were 2 files that sounded distinctly different in my opinion... 16bit dithered and the supposed truncated 9 bit. but we would have to conduct this test a few more times for it to be SCIENTIFICALLY 'valid' but hey this is music not SCIENCE....so fuck it besides scientifically speaking the dithered file is supposed to sound better anyway ;)

also i'd like to think that this showed that truncation does more harm than volume changes
 
Teacher said:

also i'd like to think that this showed that truncation does more harm than volume changes

I don't think so. According to the raw comments posted by ff123, link above, among the people who properly spotted the 9 and 13 bit samples,
5 preferred the truncated version,
at least 2 preferred the 13 bit version (there is someone we don't know if he ranked from worst to best or best to worst),
and only 1 preferred the dithered version.
I didn't take into account the results saying "no difference" between the 3 remaining samples.
 
fuzzy?

Hey ff123:

you said, "imagine that Teach was able to eliminate the 9, 11, and 13 bit files and had just two files from which to guess which one was the 16-bit dithered file. Then two correct guesses yields only (0.5)^2 = 0.25. We're back to the situation where 5 correct guesses out of 5 are needed to get > 95% confidence that the results were not by chance."

With 3 files eliminated and 2 remaining, there should be a 0.5 probability, no?
I don't believe we're assuming file replacement, are we?
 