lpdeluxe said:
Dithering comes into play when you are converting the sample rate.
Not true. Dithering refers to changing the bit depth, not the sample rate. What you are talking about is sample rate conversion (SRC), which changes 44.1 kHz to 48 kHz and so on.
Dithering should be the last step in editing/mastering, because before dithering we normally have a bigger file with more bit depth. More bit depth = more information to work with and edit. A bigger canvas, as it were.
=========================================================
Spread your fingers and hold them up a few inches in front of one eye, and close the other. Try to read this text. Your fingers will certainly block portions of the text (the smaller the text, the more you'll be missing), making reading difficult.
Wag your hand back and forth (to and fro!) quickly. You'll be able to read all of the text easily. There'll be the blur (really more of a strobe effect, due to the scanning of the monitor) of your hand in front of the text, but definitely an improvement over what we had before.
The blur is analogous to the noise we add in dithering. We trade off a little added noise for a much better picture of what's underneath.
For audio, dithering is done by adding noise at a level less than the least-significant bit before rounding to 16 bits. The added noise has the effect of spreading the many short-term errors across the audio spectrum as broadband noise. We can make small improvements to this dithering algorithm (such as shaping the noise into areas where it's less objectionable), but the process remains simply one of adding the minimal amount of noise necessary to do the job.
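As a rough sketch of that final step, here is what adding sub-LSB noise before rounding to 16 bits might look like. This is an illustration only: the function name is my own, and it uses flat (rectangular) dither where real mastering tools typically use triangular dither, often with noise shaping.

```python
import random

def dither_to_16_bit(sample_float):
    """Quantize a high-precision sample (range -1.0 to 1.0) to a 16-bit
    integer, adding noise below one least-significant bit first.

    A sketch only: flat dither, no noise shaping.
    """
    scaled = sample_float * 32767          # map -1.0..1.0 onto the 16-bit range
    scaled += random.uniform(-0.5, 0.5)    # dither: noise smaller than 1 LSB
    quantized = round(scaled)              # then round to the nearest integer
    return max(-32768, min(32767, quantized))
```

Because the noise is added before the rounding, the rounding error at any instant no longer depends only on the signal, which is what decorrelates it from the audio.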
The final version of audio that goes onto a compact disc contains only 16 bits per sample, but throughout the editing process the digital data grows in bit depth as computation occurs. The more math we do, the more bits the samples need, just as adding, multiplying, or dividing decimal numbers produces results with more digits. In the end, the digital data must be returned to 16 bits for pressing onto a CD and distribution.
There are multiple ways to return the data to 16 bits. One can, for example, simply lop off the excess bits (called truncation), or round to the nearest value. Each of these methods, however, results in predictable, determinable errors. Take, for example, a waveform that consists of the following values:
1 2 3 4 5 6 7 8
If we reduce our waveform by, say, 20%, then we end up with the following values:
0.8 1.6 2.4 3.2 4.0 4.8 5.6 6.4
If we truncate these values we end up with the following data:
0 1 2 3 4 4 5 6
If we instead round these values we end up with the following data:
1 2 2 3 4 5 6 6
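The two reductions above can be reproduced in a few lines of Python (a sketch of the arithmetic only, not of any real audio pipeline):

```python
values = [1, 2, 3, 4, 5, 6, 7, 8]
scaled = [v * 0.8 for v in values]         # reduce the waveform by 20%

truncated = [int(s) for s in scaled]       # lop off the fraction
rounded = [int(s + 0.5) for s in scaled]   # round to the nearest value

print(truncated)  # [0, 1, 2, 3, 4, 4, 5, 6]
print(rounded)    # [1, 2, 2, 3, 4, 5, 6, 6]
```

Note that in both cases the error for a given input value is always the same; that determinism is the problem dither exists to break.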
Any waveform composed of the original values, when processed by multiplying each value by 0.8, would contain errors in the result, and those errors would repeat. A repeating sine wave passing through those sample values, for example, would experience the same error every time its intended value was 3.4: the truncated result would be off by 0.4. Any time the intended value was a whole number such as 5, the error after processing and truncation would be 0. The error amount therefore changes in step with the signal's own values. The result is cyclical behavior in the error, which manifests itself as additional frequency content on the waveform (harmonic distortion). The ear hears this added frequency content as distortion.
We cannot avoid error in this process. Taking a two-digit number (4.8) and turning it into a one-digit number (4 or 5) is going to produce error, and that is unavoidable. What we want, however, is a system in which that error does not repeat as the values repeat.
A plausible solution would be to take the two-digit number (say, 4.8) and alternate the direction of rounding: round it up to 5 one time, down to 4 the next, and so on. This would make the long-term average 4.5 instead of 4, so over the long term the value is closer to its actual value. This, however, still results in determinable (though more complicated) error: every other time the value 4.8 comes up, the error is +0.2, and the other times it is -0.8. The error still repeats and is still quantifiable.
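A quick sketch of that alternating scheme (illustration only) shows both the 4.5 average and the repeating error:

```python
# Alternate rounding 4.8 up and down: 5, 4, 5, 4, ...
results = [5 if i % 2 == 0 else 4 for i in range(10)]

average = sum(results) / len(results)
errors = [round(r - 4.8, 1) for r in results]

print(average)     # 4.5, not 4.8
print(errors[:4])  # [0.2, -0.8, 0.2, -0.8]: the error pattern repeats
```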
Another plausible solution would be to round 4.8 up to 5 four times out of five and down to 4 the fifth time. This would average out to exactly 4.8 over the long term. Unfortunately, it still results in repeatable, determinable errors, and those errors still manifest themselves as distortion to the ear (though oversampling can reduce this).
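Sketching that four-out-of-five scheme the same way shows why it is no better: the average comes out right, but the error cycle is fixed.

```python
# Round 4.8 up four times and down once, in a fixed repeating cycle.
cycle = [5, 5, 5, 5, 4]
results = [cycle[i % 5] for i in range(100)]

average = sum(results) / len(results)
errors = [round(r - 4.8, 1) for r in results[:5]]

print(average)  # 4.8 exactly over the long term
print(errors)   # [0.2, 0.2, 0.2, 0.2, -0.8], over and over
```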
This leads to the dither solution. Rather than predictably rounding up or down in a repeating pattern, what if we rounded up or down in a random pattern? If we came up with a way to randomly toggle our results between 4 and 5 so that 80% of the time it ended up on 5 then we would average 4.8 over the long run but would have random, unrepeating error in the result. This is done through dither.
We calculate a series of random numbers between 0 and 0.9 (e.g. 0.6, 0.4, 0.5, 0.3, 0.7, etc.) and add them to the results of our equation before truncating. Two times out of ten the result will truncate back to 4 (when 0 or 0.1 is added to 4.8), and the other eight times it will truncate to 5; each individual outcome is random, with a 20% chance of landing on 4 and an 80% chance of landing on 5. Over the long haul the results average out to 4.8, and the quantization error is random, which is to say, noise. This "noise" result is less offensive to the ear than the determinable distortion that would result otherwise.