JoeNovice said:
Which A/D/C have you used and what did you use them with? That's a very bold statement without much fact for backing it up. Have you ever used the Apogee BigBen as a master house clock?
A bold statement, yes, but since I can back it up with simple math, the question of real-world proof is moot.
Proof I: Nyquist proof
For jitter to be audible, the timing error of a sample must be large enough to produce a significant error in a wave within the range of human hearing. An error of about a tenth of a wavelength at the top of that range is roughly the minimum drift that could result in audible artifacts, unless you want to argue that the Nyquist theorem is wrong. If you don't believe me, the math below backs that statement up.
At a 44.1 kHz sample rate, a tenth of a 20 kHz wavelength works out to roughly 20% of a sample period, which would require an insanely bad clock. Even things like S/PDIF or AES/EBU only drift half that much at most. At 88.2/96 kHz, it would require an error of almost half a sample. I don't think there's an audio interface clock on the planet with that much drift, even in really early hardware from decades back.
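If you want to sanity-check that arithmetic yourself, here's a quick Python sketch. It only assumes 20 kHz as the top of human hearing and the one-tenth-of-a-wavelength threshold from above; nothing in it is specific to any particular converter.

```python
# Rough sketch of the Proof I arithmetic. Assumes 20 kHz as the top of human
# hearing and 1/10 of a wavelength as the audibility threshold from above.
top_of_hearing_hz = 20_000.0
threshold_s = (1.0 / top_of_hearing_hz) / 10.0   # a tenth of a 20 kHz cycle = 5 microseconds

for sample_rate_hz in (44_100.0, 96_000.0):
    sample_period_s = 1.0 / sample_rate_hz
    fraction_of_sample = threshold_s / sample_period_s
    print(f"{sample_rate_hz/1000:.1f} kHz: threshold = {fraction_of_sample:.2f} of a sample "
          f"({threshold_s*1e6:.1f} us vs {sample_period_s*1e6:.1f} us per sample)")
```

That prints roughly 0.22 of a sample at 44.1 kHz and 0.48 of a sample at 96 kHz, which is where the "20% of a sample" and "almost half a sample" figures come from.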
Proof II: Decibel proof
For a real-world example, we'll use a 16 kHz tone, right at the top limit of the average person's hearing, sampled at 96 kHz. (Note: I picked 16 because it divides evenly into 96, which makes the math easier. You can redo the math for 22 kHz if you really want to, but it won't make any difference.)
Let's use a standard that allows relatively high clock jitter in its spec: AES/EBU permits 20 ns of jitter. That's about as bad as jitter gets, since a modern internal clock typically jitters in the low picosecond range, so 20 nanoseconds is huge by today's standards. Keep in mind throughout the math that follows that you will almost certainly never see jitter this bad in any real-world hardware.
Before I do the math, let me also add that a 20 nanosecond period corresponds to 50 MHz. That's megahertz, as in 2,500 times the top of the human hearing range. This alone should give you a pretty good idea of why discussions of jitter are silly with modern electronics.
The maximum error occurs where the sine wave is moving at its maximum speed, which is at the zero crossing. An error of 20 nanoseconds on a 16 kHz tone represents 1/3125th of a cycle (50 million / 16 thousand). Multiply by 2 pi (the slope of a full-scale sine at the zero crossing) and the amplitude error is about +/-0.002 of full scale, or 0.0174 dB. You need a level difference of about 1 dB, and arguably more like 3 dB, before it is even audible. That means it is impossible for jitter to be audible within the human hearing range, even with jitter that is huge by modern standards. And note that this is the absolute maximum possible error at any point on the curve for a 16 kHz tone sampled at 96 kHz.
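Here's that calculation as a small Python sketch, using the worst-case slope of a full-scale sine (2*pi*f at the zero crossing); the 16 kHz tone and the 20 ns jitter figure are the ones from above.

```python
import math

# Worst-case error from clock jitter on a full-scale sine: the signal moves
# fastest at the zero crossing, where the slope of sin(2*pi*f*t) is 2*pi*f.
f_hz = 16_000.0          # test tone from the example above
jitter_s = 20e-9         # AES/EBU worst-case jitter from the example above

cycle_fraction = jitter_s * f_hz                 # 1/3125 of a cycle
amplitude_error = 2 * math.pi * cycle_fraction   # ~0.002 of full scale
error_db = 20 * math.log10(1 + amplitude_error)  # ~0.0174 dB

print(f"fraction of a cycle: 1/{1/cycle_fraction:.0f}")
print(f"amplitude error:     {amplitude_error:.4f} of full scale")
print(f"level error:         {error_db:.4f} dB")
```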
It gets better. In reality you're only sampling at six points per cycle of that 16 kHz tone. Depending on where the peak falls between samples, a sample can land as much as 1/12th of a wavelength away from it, which knocks the measured level at that sample down by about 1.25 dB. So the timing error from jitter is far smaller than the timing error inherent in the sampling itself, more than 250 times smaller.
It gets still better. Do that with a more typical error of 2 ns (for a clock recovered from ADAT), and the jitter results in an error of 1/31250th of a cycle, or about 0.0002 of full scale, or about 0.00174 dB, roughly 1/2500th of the timing error caused by the sampling itself! Not only can you not hear an error that small, odds are you can't even measure it on an oscilloscope; the difference is that small. At +4 dBu line level (1.228 V RMS), that comes out to an error of roughly 0.2 mV, or a fifth of the distance between divisions on a good oscilloscope at its maximum precision.
Now do this for a 2 picosecond internal clock jitter. What is the maximum error in decibels? Hint: a nanosecond is 1000 picoseconds. Did you answer about 0.00000174 dB? If you did, you're right!
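And if you'd rather see all three jitter figures side by side, here's the same math in a loop, along with the comparison against the 1/12-cycle worst-case sampling offset. The labels are just my shorthand for the three cases discussed above.

```python
import math

f_hz = 16_000.0                  # test tone
sampling_offset_cycles = 1 / 12  # worst-case distance of a 96 kHz sample from the 16 kHz peak

for label, jitter_s in [("AES/EBU (20 ns)", 20e-9),
                        ("recovered ADAT (2 ns)", 2e-9),
                        ("internal crystal (2 ps)", 2e-12)]:
    cycle_fraction = jitter_s * f_hz
    amplitude_error = 2 * math.pi * cycle_fraction
    error_db = 20 * math.log10(1 + amplitude_error)
    vs_sampling = sampling_offset_cycles / cycle_fraction
    print(f"{label:24s} {error_db:.8f} dB  (sampling timing error is ~{vs_sampling:,.0f}x larger)")
```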
Proof III: Interface design proof
Mathematical theory aside, the way audio interfaces work removes jitter from the picture by its very nature. Audio interfaces use a phase-locked loop (PLL) to create a stable internal clock from an external clock. Each cycle of the local clock is nudged toward the incoming clock, so the local clock ends up tracking a smoothed average of the incoming clock rather than following every wobble of every edge.
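To make that averaging idea concrete, here's a toy Python sketch. It is not how any real interface's PLL is implemented (a real PLL tracks phase, and the loop bandwidth here is made up); it only illustrates how nudging the local clock a small fraction of the way toward each incoming edge smooths out the incoming jitter.

```python
import random

# Toy illustration of the averaging idea: a first-order loop nudges the local
# clock's period toward the incoming (jittery) clock instead of following each
# edge exactly. All numbers are made up for illustration only.
random.seed(1)

nominal_period = 1 / 44_100   # seconds per incoming word-clock cycle
incoming_jitter = 20e-9       # +/- jitter on each incoming edge
bandwidth = 0.01              # loop "gain": how hard we chase each incoming edge

local_period = nominal_period
worst_deviation = 0.0
for _ in range(100_000):
    incoming_period = nominal_period + random.uniform(-incoming_jitter, incoming_jitter)
    # Move a small fraction of the way toward the incoming period (the averaging).
    local_period += bandwidth * (incoming_period - local_period)
    worst_deviation = max(worst_deviation, abs(local_period - nominal_period))

print(f"incoming jitter: +/-{incoming_jitter*1e9:.0f} ns")
print(f"worst local deviation after smoothing: +/-{worst_deviation*1e9:.2f} ns")
```

The smoothed clock wanders by only a small fraction of the incoming +/-20 ns, which is the whole point of clocking through a PLL.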
Even with a modestly jittery clock signal coming from an outside source, the PLL output should still be relatively solid, but it will exhibit some tracking jitter even if the incoming clock is absolutely perfect. The catch is that even a theoretically perfect clock will not be perfect by the time it reaches the audio interface: passing through the cable, the amplifiers on output, the amplifiers on input, and so on degrades the pulse edges.
The kicker is this: because of that degradation, even with good clock recovery, an internal clock derived from a recovered external clock will always exhibit more jitter than a clock driven directly by a nearby crystal. Thus, an external clock automatically makes jitter worse, regardless of whose clock it is, regardless of whose interface it is, regardless of whether it is sent as word clock, S/PDIF, AES/EBU, or ADAT, and regardless of all other factors.
Now, the argument the audiophiles give is "phase distortion," and technically, they are correct. It does produce phase distortion. But as you can see plainly from the math, the error at any given point in time is inaudible by many, many orders of magnitude, and thus the error is inaudible, period.
The only argument that is even plausible is the case where you are recording multiple audio streams with different interfaces, at least one of which is externally clocked. In that case, you could have a very slight phase offset that might be improved by slaving both interfaces to a single external clock rather than to each other. However, the phase offset from clock recovery will likely pale next to the multi-interface timing discrepancies inherent in the data path after the samples leave your interfaces, due to time-stamping precision (or lack thereof), and thus this error can be discounted just as easily.
In short, external clocks serve a purpose. They produce a single stable clock signal in multiple formats to drive different devices. If your purpose for using an external clock is to make your recordings somehow sound better, though, repeat after me: it's all in your head.