Dithering question

  • Thread starter: EightMilesHigh
Again I agree with your statement MS. I have no problem with someone saying that they like the sound of truncation or a given type of dither over another. Some like the sound of clipping for certain types of music too. This is a taste issue. But to say that you can't hear a difference, or one is not needed and is pure hype propagated by software companies or manufacturers of converters, well that doesn't ring true for me.

In the FWIW department I just ran a blind test with my associate engineer using a Hip-Hop track where I switched between truncation and various types of dither. She could tell the difference immediately. And as you and Glen said, the differences may suit one type of mix better than another. In general dithering gave more detail to this track.

As MEs we are told to generally keep the processing as transparent as possible and not to color the mix, so we usually tend to be conservative with this sort of thing. Then again there are cases where the mix is better suited by "coloring" it, and I'll run this by the client if they are at the session. I've had situations where a client has told me that "cleaning up" a mix revealed too many flaws in his recording and wanted things a bit on the muddy side to cover it up. In a situation like this detail may not be a desirable outcome. There are cases where harmonic distortion adds an element of aggression (certainly the case with a distorted guitar). For a Punk band truncation may be a cool effect; as an example, a Punk band once told me the master sounded "too good" and to "fuck it up".

Different strokes, but they are audible strokes.
 
I'm not really in total disagreement with all that you said, except that one has to ask: is there something there that I may not be hearing today, for a variety of reasons?
What I *honestly* don't understand, Tom, is how an ME can perform their job for what they "might not be hearing". This makes little sense to me in a couple of ways:

First, the whole idea behind all the ear training, mastering suite design, $10k audiophiliac monitors, boutique gear, etc. is to hear what is there as accurately as possible, certainly with far more accuracy than the vast majority of playback systems and ears out there in the field. If you can't hear it there, how can it be heard anywhere else? That's what we mix engineers are paying you mastering guys for ;).

Second is the assumption that what one doesn't hear is an improvement over what one does hear, or at least would be if only one could hear it. This assumption basically says that dithering is *always* - 100% of the time - better for the client and the production than non-dithering is. In this modern world of production values, where no property of the production from dynamic range to signal distortion is sacred or immune from the fangs of the producer or client, how is it that dithering is always the appropriate sound and answer for every production? Especially in those cases where one's ear cannot verify it? I can't think of a single other processing effect that's done strictly for sound quality reasons that is applicable, necessary and desired 100% of the time.
BTW, one last comment. Dithering is more than adding noise; it helps to remove the harmonic distortion caused by quantization.
Technically, it IS noise, just not in the traditional analog sense in which we're used to thinking. Remember we're talking digital information here, and this is all an offshoot of information theory, which roughly defines noise as any signal that does not itself carry any information.* This noise is actually another form of quantization distortion that is added to the QD of truncation. This additional distortion can sometimes yield test measurements that indicate lower HD numbers, sure, but when I hear that used as an argument for the use of dither, I flashback to the late 70s/early 80s when there was the last big debate over harmonic distortion.

That time the debate was in the form of HD measurement specifications in audiophile amplifiers, integrated amps and receivers. There were spec wars back then that kept pushing HD numbers lower and lower in this gear. It wasn't unusual to find a mfr boasting an HD level of 0.0005% in expensive full-page advertising campaigns and claiming that it had to sound better than a competitor whose amp spec'd out a full 10 times worse at 0.005%. It was all eventually deemed to be a bunch of baloney when rigorous testing showed not only that the audibility of HD was entirely frequency- and content-dependent, but that on average anything below 0.1% or so was either completely inaudible or so swamped by S/N or intermodulation (IM) distortion or any of a raft of other factors as to be rendered meaningless.

OK, Tom, I can completely understand the audiophiliac desire and ethic to which many mastering engineers subscribe to make things as good as possible. But there comes a point where it can become pragmatically unnecessary. We're lucky with dither in that exercising the process is a matter of a non-destructive mouse click, so it certainly doesn't hurt to give it a try. But to default to it, flipping bits when they simply don't matter (when you are not sure of what you hear, even with the very, very best of gear and training), is just not a pragmatic choice, IMHO.
-----
One last analogy. We often hear the term around here that "this is not rocket science" (or, jokingly, "rocket surgery" :) .) What if it were? Then we'd obviously be using every last decimal point of precision we could muster and get things as right as possible, right?

Not necessarily.

The amount of "correction" (actually more accurately, "fudging") that dither applies to truncation is akin to the difference between the accuracy of Newtonian mechanics and Einsteinian relativity. By using relativistic equations, we can achieve a level of accuracy in our calculations and theories we simply cannot get with the old apple falling from the tree of Newton.

So when the engineers at JPL and NASA program the computers that control their spacecraft to the Moon, Mars, Jupiter or even Pluto, they load their navigation software up with Einstein's equations, right? NOPE. It's all the same basic Newtonian force = mass x acceleration stuff, used as readily by the Mars Pathfinder mission controllers as by WWII artillery officers. The level of refinement offered by relativistic mechanics simply is not needed at the scale and speed of spaceflight (not until we figure out warp drive, anyway ;))

Nor is the refinement of dither usually necessary in the macro world of musical reproduction. Perhaps when working on laboratory calibration and refined scientific experiment-quality audio, such refinement matters, but without it the music will get to our ears just fine, much as the Voyager probes made it to the outer planets.

If you guys want to continue clicking that mouse, be my guest. I just ask that anybody who does be honest with themselves as to why they are doing it.

Is it actually because it is a real improvement in the resulting production? If so, fine. If not, then why?

G.

*Well, to get really technical, most dithering algorithms are not random or even pseudo-random, but rather follow fairly simplistic patterns. In this way they do themselves convey the information of that pattern, but it is information that is not directly relevant to, and smudges, the information within the signal to which it is applied. It is in fact this "smudging" of information that folks call the "removal of harmonic distortion" or "de-correlation" of the sample information.
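
(If anyone wants to see that "smudging" for themselves, here's a minimal Python/NumPy sketch of my own - plain pseudo-random TPDF dither, the simplest flavor, nothing like the fancier shaped algorithms, and the -90dBFS level and 997Hz tone are just illustrative choices. Truncating a very low-level sine leaves error that is correlated with the signal - harmonic spurs - while dithering before truncation turns that error into a signal-independent noise floor.)

[CODE]
# Sketch: truncation vs. TPDF-dithered quantization of a low-level sine.
# Python/NumPy; the -90 dBFS level and 997 Hz tone are arbitrary illustrations.
import numpy as np

fs, n = 44100, 1 << 16
t = np.arange(n) / fs
x = 10 ** (-90 / 20) * np.sin(2 * np.pi * 997 * t)   # -90 dBFS sine, floating point

lsb = 1.0 / 32768                                     # one 16-bit step (full scale = +/-1)
rng = np.random.default_rng(0)

def truncate16(sig):
    return np.floor(sig / lsb) * lsb                  # plain truncation to 16 bits

def dither16(sig):
    tpdf = (rng.random(sig.size) - rng.random(sig.size)) * lsb  # +/-1 LSB triangular noise
    return np.floor((sig + tpdf) / lsb) * lsb         # dither first, then truncate

def error_spectrum_db(original, quantized):
    w = np.hanning(original.size)
    mag = np.abs(np.fft.rfft((quantized - original) * w)) / (original.size / 4)
    return 20 * np.log10(np.maximum(mag, 1e-12))

# Truncation error piles up in discrete harmonic spurs; dithered error is a flat floor.
print("worst error bin, truncated: %.1f dBFS" % error_spectrum_db(x, truncate16(x)).max())
print("worst error bin, dithered:  %.1f dBFS" % error_spectrum_db(x, dither16(x)).max())
[/CODE]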
 
Well the thing about HD is if you've got that, you almost certainly have IMD as well. But nobody ever developed an official spec for intermod even though it's relatively easy to test. The general goal of an amplifier design should thus be to minimize all distortion, unless there is a specific reason for wanting distortion, for example a 'color' amp that does lots of second order.

That's a bit more complicated when it comes to QD, because it works counterintuitively to distortion in an analog circuit. Had I played two adjacent notes on my bass, I would expect two things: a slight increase in analog distortion from my amplifier, due to IMD, and a decrease in QD due to an increase in signal complexity. The increase in analog distortion would probably make it difficult and perhaps impossible to note what happened to the QD. I mean, -120dBFS is small to begin with, and would be easily swamped by any increase in analog distortion.
 
And here's the picture of that, technically three bass notes because my open low C is vibrating in sympathy a wee bit. Above 1kHz, it's pretty much just an increase in noise. Between 200Hz and 800Hz there are some harmonics that are higher when truncated, but also some that are higher at 24 bit. The differences live between -116dBFS and -126dBFS; I made this big so you could kinda see that:
 
Boy, that looks bad, let's try it here:

[Attached image: two_bass_note.GIF]
 
Well the thing about HD is if you've got that, you almost certainly have IMD as well. But nobody ever developed an official spec for intermod even though it's relatively easy to test. The general goal of an amplifier design should thus be to minimize all distortion, unless there is a specific reason for wanting distortion, for example a 'color' amp that does lots of second order.

That's a bit more complicated when it comes to QD, because it works counterintuitively to distortion in an analog circuit. Had I played two adjacent notes on my bass, I would expect two things: a slight increase in analog distortion from my amplifier, due to IMD, and a decrease in QD due to an increase in signal complexity. The increase in analog distortion would probably make it difficult and perhaps impossible to note what happened to the QD. I mean, -120dBFS is small to begin with, and would be easily swamped by any increase in analog distortion.
All true.

The question embedded in all that (for me, anyway) is the question of relevance. When anything below 0.1% distortion (whatever the nature of the distortion) - or let's push it a whole 10x stricter and say 0.01% - is effectively inaudible, the push to refine to, say, 0.0005% - another 20x tighter still - becomes a distinction without a difference. It's akin to making sure that your steak knife is sharpened down to a single-molecule edge. It really isn't going to cut your steak any better.
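
(For anyone who wants those same percentages in dB terms, here's a two-line conversion - standard arithmetic, nothing more:)

[CODE]
# Distortion percentage -> dB relative to the fundamental: 20 * log10(pct / 100).
import math
for pct in (0.1, 0.01, 0.0005):
    print("%g%% THD is about %.0f dB down" % (pct, 20 * math.log10(pct / 100)))
# 0.1%    -> -60 dB
# 0.01%   -> -80 dB
# 0.0005% -> -106 dB
[/CODE]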

Compound that with the "swamping" effect: there are going to be so many other "noise" factors of greater magnitude or more humanly noticeable character (or both) that they just mask that refinement into ineffectiveness.

And then we have the final "swamping effect" of moving things to the realm of amateur or prosumer recording and out of the realm of the ME with the $50k transporter console. It may be unfair and incorrect to describe dithering your average home recording made on a $1500 DAW as "lipstick on a pig", but the long, long list of potential higher magnitude "issues" that will totally swamp whatever minuscule positive effect dithering may have makes (IMHO) home dithering such a small, small issue that it's a crying shame that we have to devote so much forum bandwidth to debating it. ;)

G.
 
Let's try to reproduce that experiment with test tones and see what we get. I took two triangle waves, at 60Hz and 80Hz, and used a couple of low-pass filters to limit the harmonic series to something pretty similar to what you see in my bass recording.

In the first picture, you see a comparison of these noise-free notes at 24 bit and truncated to 16 bit. The truncation does display some QD, notably around 5kHz where, unlike the general noisiness of the rest of the QE spectrum, it's going to sound like a noticeable hash. That's bad, but certainly nowhere near as bad as a -50dBFS sine wave example looks and sounds.

Next, I added white noise to the triangle waves, to simulate a reasonable level of electronic noise. It shows up at the -135dBFS level on the FFT, which integrates to about -105dBFS. Let's think about that noise level--if your converters have 115dB dynamic range--unweighted--and you peak at -10dBFS when you record, that's about where your noise will live. That's pretty close to what I achieve when recording my bass DI; that was into an ART Digital MPA, which is relatively quiet, plus I have an early model with the AKM5394A chip, which is still their top range. I peaked at about -5dBFS for that sample (although again, I showed you the end of the fadeout, not the peak).

OK, so now we have a test tone which is going to be about as good as any analog source can get. I charted the earlier truncation against the truncation with added noise. Note the overall decrease in QE; I didn't chart that against the 24 bit version, but you can eyeball it. Anyway, the 5kHz hash is reduced from -116dBFS to -123dBFS with the small amount of added noise.

Obviously that noise wasn't quite enough to fully dither the signal, but it had a very measurable effect. The remaining QD peak at 5kHz is less than 0.0001%. From my point of view, that is the worst possible case QD from any electronic source, and I would venture that QD on an acoustic source probably has to be even less.
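
(If anyone wants to try a rough version of this at home, here's a Python/NumPy sketch of the same idea. The 60/80Hz tones and the roughly -105dBFS noise come from the description above; the lengths, the harmonic cutoff, and the band-limiting method are my own guesses, so the numbers you get won't match my charts exactly.)

[CODE]
# Rough re-creation of the triangle-wave-plus-noise test described above.
# NumPy only; band-limiting is done by summing a finite harmonic series
# rather than with the low-pass filters used in the original test.
import numpy as np

fs, n = 44100, 1 << 18
t = np.arange(n) / fs
rng = np.random.default_rng(1)

def bl_triangle(f0, cutoff=1000.0):
    """Band-limited triangle wave: odd harmonics falling off as 1/k^2."""
    sig = np.zeros_like(t)
    k = 1
    while k * f0 < cutoff:
        sig += ((-1) ** ((k - 1) // 2)) / k**2 * np.sin(2 * np.pi * k * f0 * t)
        k += 2
    return sig

x = 0.25 * (bl_triangle(60.0) + bl_triangle(80.0))       # peaks safely below 0 dBFS
noise = rng.normal(0.0, 10 ** (-105 / 20), n)             # ~-105 dBFS RMS "analog" noise

lsb = 1.0 / 32768
def trunc16(sig):
    return np.floor(sig / lsb) * lsb                      # truncate to 16 bit, no dither

def worst_error_bin_db(clean, quantized):
    w = np.hanning(n)
    mag = np.abs(np.fft.rfft((quantized - clean) * w)) / (n / 4)
    return 20 * np.log10(mag.max() + 1e-20)

print("worst QD bin, noise-free source: %.1f dBFS" % worst_error_bin_db(x, trunc16(x)))
print("worst QD bin, with added noise:  %.1f dBFS" % worst_error_bin_db(x + noise, trunc16(x + noise)))
[/CODE]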

If you are only using softsynths, well you might need to dither lest you suffer much worse distortion.
 
All true.

The question embedded in all that (for me, anyway) is the question of relevance. When anything below 0.1% distortion (whatever the nature of the distortion) - or let's push it a whole 10x stricter and say 0.01% - is effectively inaudible, the push to refine to, say, 0.0005% - another 20x tighter still - becomes a distinction without a difference. It's akin to making sure that your steak knife is sharpened down to a single-molecule edge. It really isn't going to cut your steak any better.

Yeah, definitely a point of diminishing returns. The other thing is people goose those specs by not quoting the level tested. Any THD spec that doesn't state a tested level--it should be the nominal level of the box--goes straight in the trash in my book.

Although if you have low levels of THD, say the 0.01%, and you start accumulating it across many boxes in a chain, then you end up with what I very technically call "hash". The weird thing is you can't predict whether people will like or hate your hash. You see this all the time when somebody mods some box to make the amps faster and with less distortion. They talk about the effect on the lows, when we know there should be little effect directly there, but the psychoacoustic effect of less high-end accumulated hash is better definition in the lows. But then some people don't like "fast" lows. I say it's always easy to slow things down later; harmonic distortion is not hard to add . . .
 
This noise is actually another form of quantization distortion that is added to the QD of truncation. This additional distortion can sometimes yield test measurements that indicate lower HD numbers, sure, but when I hear that used as an argument for the use of dither, I flashback to the late 70s/early 80s when there was the last big debate over harmonic distortion.
Does the dither add to the QD or replace it?
 
Let's forget about dither and add 2 tracks of noise versus 35 tracks of noise. When summed together, which has the higher noise floor?

The intuitive answer is the mix with 35 tracks. But the correct answer is it depends on the level of the noise in the tracks. And nobody records and mixes only noise anyway. More to the point, the ambient noise floor in the room generally swamps out truncation distortion by 20 dB or more.
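
(To put rough numbers on "it depends on the level": uncorrelated noise adds on a power basis, so N equal tracks only raise the floor by 10*log10(N). A quick sketch, with made-up track counts and levels:)

[CODE]
# Uncorrelated noise floors add on a power basis: 10*log10(sum of 10^(L/10)).
# Track counts and levels below are made up for illustration.
import math

def summed_noise_db(levels_dbfs):
    return 10 * math.log10(sum(10 ** (l / 10) for l in levels_dbfs))

print(summed_noise_db([-90] * 2))    # ~-87.0 dBFS: two tracks of -90 dBFS noise
print(summed_noise_db([-90] * 35))   # ~-74.6 dBFS: thirty-five tracks of the same
print(summed_noise_db([-70] * 2))    # ~-67.0 dBFS: two noisier tracks beat all 35 quiet ones
[/CODE]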

It's my belief that quantization distortion can be heard even above -90dBFS.

Then please take my "dither challenge" HERE and tell us which sections are dithered and which are not. This will tell you right away if your "belief" is correct.

Why do so many other people hear the differences in noise-shaped dithering algorithms? That difference is also below -90. And I'm not talking about "golden ears," but about clients that come here to the studio.

On normal music recorded at normal levels? Nobody can tell the difference because of masking. Some people think they can tell, but as soon as you do a proper blind test all of a sudden they can't tell anymore.


This excerpt from the site explains it all:

54 dB of gain have been added AFTER dithering

Well, duh, of course you can hear the difference if you crank the gain by 54 dB! But then you'd blow up your speakers (and ears) when actual music is played. So while this may be a technical curiosity, it's not based in reality.

--Ethan
 
Does the dither add to the QD or replace it?
Each, both or neither, depending on how you want to parse words, I suppose :o.

Think of it like this: "Quantization Distortion" is little more than a fancy phrase for sample inaccuracy.

In this case, we're referring to the amount of accuracy to which the digital sample represents the original actual analog value. Quantization distortion exists any time the analog value cannot be precisely represented by the number of digital digits provided.

When we say that the value of "pi" is 3.1415926, that's pretty accurate, but it's not 100% accurate, because the exact value of pi can only be represented digitally (numerically, that is) with an infinite number of decimal places. So there is some degree of quantization error in the value 3.1415926.

When we truncate that value to four decimal places - 3.1415 - the level of inaccuracy, and therefore the amount of QE, increases. This is analogous to the increase in QE we find in our audio samples when we truncate from 24 to 16 bit.

Now, when we dither, we are pseudo-randomly changing the value of the last digit. 3.1416, 3.1414, 3.1415, etc. It really is not removing or lessening error; over enough samples, the amount of error statistically remains pretty much the same. We're not changing the cumulative overall accuracy, we're just statistically "smearing" it around a little.

We're adding "informational noise" to the last digit of accuracy, but it's not really changing the overall amount of accuracy, just its "flavor" - from a flavor of hard truncation to a flavor of dithered truncation.
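
(Here's that pi example in code, purely for illustration: pseudo-randomly jiggling the last kept digit, as described above, doesn't make any individual value more accurate - the error just changes character, from a fixed one-sided offset to something that wanders around the same order of magnitude.)

[CODE]
# The "pi" analogy above in code: pseudo-randomly jiggling the last kept digit.
# Values are illustrative only.
import math
import random

random.seed(0)
step = 1e-4                              # keep four decimal places
exact = math.pi                          # 3.14159265...

truncated = math.floor(exact / step) * step          # always 3.1415
samples = [(round(exact / step) + random.choice((-1, 0, 1))) * step
           for _ in range(10000)]                    # 3.1415 / 3.1416 / 3.1417

print("truncation error:        %+.2e (same every time)" % (truncated - exact))
print("first few dithered:      %s" % [round(s, 4) for s in samples[:5]])
print("mean |dithered error|:   %.2e (same order as the step size)"
      % (sum(abs(s - exact) for s in samples) / len(samples)))
[/CODE]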

G.
 
What I *honestly* don't understand, Tom, is how an ME can perform their job for what they "might not be hearing". This makes little sense to me in a couple of ways:

First, the whole idea behind all the ear training, mastering suite design, $10k audiophiliac monitors, boutique gear, etc. is to hear what is there as accurately as possible, certainly with far more accuracy than the vast majority of playback systems and ears out there in the field. If you can't hear it there, how can it be heard anywhere else? That's what we mix engineers are paying you mastering guys for ;).

Glen -

I think that you are taking my comment out of context. I said if "one" doesn't hear, meaning engineers who are new to the industry and may not have developed critical listening skills, or people with compromised listening environments who may not be hearing an effect. Judging from the responses of folks I work with, most think these sorts of things are audible, and I know of no pro ME that thinks otherwise. I know that you probably don't care about that, but as I respect the work of these folks, I do.

As far as tests, the only valid listening tests that I know of are blind AB/X tests, where you hear A then B and determine if X is A or B within a degree of probability. You have to have a frame of reference for comparison when doing tests like this. If someone wants to create this I'm all in; personally I likely won't have time to do this for a while, but I may put something together for my class in the fall.
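
(For reference, scoring an AB/X run is just binomial statistics - how likely is it to get at least that many trials right by pure guessing? A quick sketch, with made-up trial counts:)

[CODE]
# Scoring an ABX run: probability of getting >= `correct` of `trials` right by guessing.
# Plain binomial math; the trial counts below are illustrative.
from math import comb

def abx_p_value(correct, trials):
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

print(abx_p_value(12, 16))   # ~0.038 -- usually taken as "probably heard a difference"
print(abx_p_value(9, 16))    # ~0.40  -- indistinguishable from guessing
[/CODE]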

Well anyway, I really don't want to get sucked back into this again, as I said above. It takes away from my billable time :-)

Best,
Tom
 
Judging from the responses of folks I work with, most think these sorts of things are audible, and I know of no pro ME that thinks otherwise. I know that you probably don't care about that, but as I respect the work of these folks, I do.
What I do care about, Tom, is that none of you guys find that fact as suspicious as you should.

The idea that whether something is audible or not or even is relevant or not is so strongly correlated to profession should raise a red flag or two right there that there's something more than just critical listening skills going on. Don't you honestly think that there's a little bit of "the guy with the hammer sees every problem as a nail" going on here too? You guys can't even agree on the sample rate debate, or the whole jitter issue, or even which monitors or preamps sound the best. Yet dither - an effect which if it were not such a marginal effect would not generate so much debate - is a slam dunk?

And it still does not address the question as to why the audible difference is always a positive difference. This is contrary to every other type of process or effect in our trick bags, where there is never an instance where a particular sound or effect is always -100% of the time - appropriate. Yet dither is an automatic process. Have none of you guys ever thought that maybe for some productions that leaving the harmonic distortion of truncation in there actually *serves* the production? If not, why should that deviation of the signal be any different than any other distortion or deviation in the signal?

And please don't paint this as disrespect for you or your colleagues or the art of mastering. You should know better than anyone here that I have the highest respect for, and actively defend, what you true professionals do - a respect that I curiously find reciprocated toward us mixing guys by only a handful of mastering engineers, BTW. But you also know that I call a horse a horse and a cow a cow and don't hold back when I do, and there's more partisanship in the uniformity of mastering-engineer opinion on this subject than just science and critical listening skills.

G.
 
I've given myself a puzzle this afternoon. Realizing that I didn't subject my bass note to a long fade, as I did the organ notes, I went ahead and did that. One thing to note in Tom's sine wave at -80dBFS is that it generates QD peaks as high as -104dBFS. That's audible, no question, at a calibrated listening level on headphones.

But what about my bass note? I only held it to -50dBFS, because I felt like that was it, and I was getting bored. Had I held it longer, the electrical noise would not have changed, so it would still dither to the extent it was able.

But not on a fade out using a DAW; there the noise and the signal drop together. Going back to my organ pipes, the fade still never caused a QD peak above -140dBFS, which I consider inaudible at any monitoring level.

First, I tried the simulated triangle wave + noise sample. That measured easily audible QD peaks at -116dBFS. Next, I moved on to the actual bass note. With the fade applied, the last second averaged out to -90dBFS peak on the bass note. There were QD peaks at about -122dBFS; still at the same frequencies as harmonic distortion. You would think that wouldn't be audible against the -90dBFS fundamental, but because of good ol' Fletcher-Munson, in perhaps the last 100 msec I can't hear the fundamental anymore, and all that's left is QD. Sounds like a pitch-shift effect.

So then I turned on the dither; no surprise, no more QD peaks. But unfortunately the noise was as objectionable as the QD; it didn't appear to change sonically as the QD did, but it was loud.

Then I started to wonder what was going on; I had never noticed either of these phenomena before.

So I noticed I actually had my calibration point set +6dB (I was doing high SPL tests earlier today). When I turned the cans back down to the proper level, neither was objectionable.

Thus, the puzzle--what to do? This was again the UV22HR noise-shaped dither, about as quiet as they get. And there are more than a couple of tunes that end with a bass note fadeout. If you don't dither, you can hear the QD briefly. If you dither, you can hear the abrupt change from dither to the 2 second black in between tracks. You can't fade after dither, that creates the same problem, even if all you are fading is the dither.

Again, this is only on too-loud headphones. But what to do? No real solution I can see, other than leave it at 24 bit . . . I guess you can stick dither noise in using audio-in-pause . . . I don't think most MEs are doing that, though I could be mistaken. I think I've tended to leave dither on the whole master chain, so it should be there in my pauses. I like crossfades anyway . . . Consumers might never notice if their D/A and/or headphone amp aren't particularly quiet.

I did say in the beginning that 16 bit didn't sound like 24 bit, no matter what you do. Information is lost, so it's just a matter of choosing your poison sometimes . . .

OH! Also, here's something I never noticed in four years of using WL5: The Fade tool under Process only applies 16 bit fades, apparently without any dither, even on a 24 bit file! Yikes! I don't think many people use that for actual editing, but I use it for screw-around tests all the time :eek: I bet WL6 doesn't do that . . . but I've been holding out for the WL7 upgrade :o
 
OH! Also, here's something I never noticed in four years of using WL5: The Fade tool under Process only applies 16 bit fades, apparently without any dither, even on a 24 bit file!
Could you elaborate on that a bit, Jon? I'm not sure I understand the mechanism there; are you saying it leaves the last 8 bits alone, just fading the top 16 until all that's left is bottom 8 bits of signal at the end?

G.
 
As a slight detour could/would someone answer Bushmaster's question? It isn't related to the high level discussion but would keep the rest of us more attuned to the discussion - you know bread & circuses?
I'm learning as I read - so please continue.
 
As a slight detour could/would someone answer Bushmaster's question? It isn't related to the high level discussion but would keep the rest of us more attuned to the discussion - you know bread & circuses?
I'm learning as I read - so please continue.

Of course! You use the dither setting for the new bit depth--16 bit dither when going from 24 bit to 16 bit. You apply dither, then truncate.

24 bit dither would be for truncating 32 bit float to 24 bit. This is usually done automatically at the master section of the DAW, after the fader. At least that's the idea . . . although the corresponding QD would be quiet indeed, at levels 48dB below what we're talking about here.
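
(In code, "apply dither, then truncate" for a 24-bit-or-float to 16-bit conversion looks roughly like this - a plain flat TPDF sketch in Python/NumPy of my own; noise-shaped dithers do something fancier, but the order of operations is the same.)

[CODE]
# "Apply dither, then truncate": float/24-bit samples down to 16 bit.
# Flat TPDF dither at +/-1 LSB of the *target* depth; noise-shaped dithers differ.
import numpy as np

def to_16_bit(samples_float, rng=None):
    """samples_float: float signal scaled to +/-1.0 (e.g. a 24-bit or float mix)."""
    rng = rng or np.random.default_rng()
    lsb = 1.0 / 32768                                   # one 16-bit step
    tpdf = (rng.random(samples_float.size) - rng.random(samples_float.size)) * lsb
    dithered = samples_float + tpdf                     # 1) add dither at the new LSB
    quantized = np.floor(dithered / lsb)                # 2) then truncate to 16 bit
    return np.clip(quantized, -32768, 32767).astype(np.int16)

# 24 bit dither would be the same idea with lsb = 1.0 / 2**23, applied when
# going from 32-bit float to 24 bit (usually automatic at the DAW's master output).
[/CODE]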
 
Could you elaborate on that a bit, Jon? I'm not sure I understand the mechanism there; are you saying it leaves the last 8 bits alone, just fading the top 16 until all that's left is bottom 8 bits of signal at the end?

G.

No, I'm saying it's a 16 bit process, truncating the 24 bit file down to 16 bit without dither before returning the faded (and distorted) audio back to the 24 bit file. This can be verified using the bit meter.
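
(If you don't have a bit meter handy, the same check can be done by hand: a signal that went through a 16-bit process and got padded back out to 24 bit comes back with all-zero bottom 8 bits. A sketch, assuming you've already loaded the raw 24-bit integer samples into an array - the file-reading part is up to you, nothing WaveLab-specific here:)

[CODE]
# Bit-meter-style check: did a "24-bit" signal actually pass through a 16-bit process?
# `samples_int24` is assumed to be an integer array of raw 24-bit sample values.
import numpy as np

def used_bottom_bits(samples_int24, bits=8):
    """True if any of the lowest `bits` bits are set anywhere in the array."""
    mask = (1 << bits) - 1
    return bool(np.any(np.asarray(samples_int24) & mask))

# If this returns False on the faded file, the fade was rendered at 16 bits
# and padded back to 24 with zeros -- exactly what the bit meter shows.
[/CODE]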

I don't know if you're real familiar with WL, but pretty much everybody does work in its Montage feature. Montage doesn't do that, it operates properly. Before Montage was introduced (v4? I didn't use v3), WL had a bunch of audio tools accessed via the menu bar, which can only process a single audio file, and perform destructive edits (with unlimited undo). There are some cool features there, but a lot of them as of v5 were kinda "legacy", so I don't know if they have received the same attention as the Montage feature once it was introduced.

I would go over to the WL board and search/ask, but I'm nearly two versions behind now, so . . .
 
No, I'm saying it's a 16 bit process, truncating the 24 bit file down to 16 bit without dither before returning the faded (and distorted) audio back to the 24 bit file. This can be verified using the bit meter.
Gotcha. I shoulda fingered that one out myself, but I brain locked (and I couldn't unlock it using kgb because I refuse to voluntarily use phone text messaging ;).)
I don't know if you're real familiar with WL
Not really. I'm familiar enough with its general capabilities and can get my way around it if I need to, but all I have myself in that regard is an OOOLLLLD version of WL Lite (v2.5).

I grew up on Sonic Foundry, so that's what I'm more comfortable with; so I'm more of a Sound Forge/CD Architect guy when it comes to two tracking (with Vegas for video.) That's not an endorsement or a technical preference, necessarily, just what still feels comfortable in my hand after all these years.

G.
 